Tuesday, June 04, 2024

Can you retain free will at individual and societal level? How much of it do we currently have to compare against getting more free will as a possibility instead of losing it.

Sunday, June 02, 2024

Actuators and sensors

If LLMs can be paired with actuators and sensors ie robots or drones that are connected to the internet, then they can avoid pizza glue situations by actually trying out when feasible or talking to pizza making robots. This calls for general audience cooks that share their experiences like the blog minded than the chef trying to keep their recipe a secret.

LLMs can then delearn what is not useful and readjust weights.

Then we can have more of Move 37 in Go game that made sense only later on. Geoffrey Hinton in the video talks about three concepts of language.

Imagine a biosphere of robots, a passage of rite before they can be released in the wild real world where the scenarios are not all that they can be prepared for.

This vision is also referred to as physical intelligence.

How humans learn

The first you see something - you might find it weird. The second time you see the same thing , you come to see it as a thing. eg

This is a sample pin macro statement in LEFDEF provided as sample by chatGPT.

text,label

"PIN VDD",PIN

" DIRECTION INOUT ;",PIN

" USE POWER ;",PIN

"END VDD",PIN

"MACRO cell1",OTHER

" CLASS CORE ;",OTHER

" SIZE 10 BY 10 ;",OTHER

" SITE core_site ;",OTHER

"END cell1",OTHER

Saturday, June 01, 2024

David Tong: Lectures on the Standard Model

standard model

David Tong: Lectures on the Standard Model

Diffusion index

Chicago PMI Purchaser Manager Index is a diffusion index that shows the strngth of the manufacturing economy.

Stockmarket diffusion index

Friday, May 31, 2024

AI weekly

Synthetic data: phi models, newest stable diffusion.

MoEs: megablocks, llava-moe.

PEFT: eg DoRA, vera - Vector-based Random Matrix Adaptation

Instruction tuning: alpaca,moe+IT paper from Google.

VLMs: Apple and HF papers,

LLM embeddings: eg llm2vec,

239

Dwarkesh with sholto-douglas-trenton-bricken

sholto-douglas-trenton-bricken

AI scaling

grokking

monosemanticity

sparse penalty

Distilled models

"one hot vector that says, “this is the token that you should have predicted.”"

chain-of-thought as adaptive compute.

key value weights

Roam notes on dwarkesh patel conversation with sholto douglas trenton bricken

Tuesday, May 28, 2024

NeuroAI

CS375

FSD without convnets

Vision Transformers compared with CNN and its need for large datatset and their inductive biases. Swin (shifted window) ViT.

"CNN is even backbones behind some of the non-grid signal processing networks like equivariant nn, graphCNN and pointnet for point cloud etc."

Alternate and hybrid architectures possibly being used by Tesla FSD instead of CNNs.

A whole-slide foundation model for digital pathology from real-world data - GigaPath, a novel vision transformer for pretraining large pathology foundation models on gigapixel pathology slides.

Building human level intelligence with neuroanatomy approach or the parts of the brain approach where you build artificial cerebral cortex as in

"1.LLMs are basically the prefrontal cortex. 2.Tesla built something akin to a parietal and occipital cortex."

Lecture 14 - vision transformers from efficientml.ai MIT course.

VILA-1.5, an efficient visual language model (VLM) that can understand not only images but also videos.

Apple's Dataset Decomposition: Faster LLM Training with Variable Sequence Length Curriculum tries to avoid a limitation of current LLM - Fixed token usage regardless of problem difficulty. fixed token usage regardless of problem difficulty

Efficient ML Harvard seminar - learning to learn

Monday, May 27, 2024

Foundation models in generative AI

Scaling theory podcast with Yann Le Cunn

Foundation models will be customised per use case instead of a giant catch all model pan languages. Building AI models is faster and cheaper than you probably think. Y combinator companies used two levers of better architectus or lesser data to reduce computation.

They are presumed to have high impact when the cumulative amount of compute used for its training exceeds 10^25 floating point operations (FLOPs),[23] - EU Law but you have learned just above that the models will try to not use much compute.

Regulation could be a threat to open models of Meta. Open models imply oversight and hence safer AI.

open models

open software stack

open OS - Linux servers, Apache server side frameworks

Pytorch is open

While finetuning the foundation models per language is a task. John Schulman in a chat with Dwarkesh Patel mentioned an interesting finding that if you do all your fine-tuning with English data, the model will automatically behave well in other languages. This can be extended to leverage this in robots too. The collaborators’ theory is that learning about the physical world in one robot body should help an AI to operate another — in the same way that learning in English can help a language model to generate Chinese, because the underlying concepts about the world that the words describe are the same.

Schulman says that a version of this with multimodal data where if you do text-only fine-tuning, you also get reasonable behavior with images.

Its language time. All languages need to prvide their data open source. If linguists can point out common rules of language, this can get furhter and faster.

Small scale AI startups fine tuning a foundation model should show a figure of merit.

Is Vision foundation model the future instead of billions of parameters? Considering that humans know more with so little data than what these moels are trained on in few years like a four year old.

How to become an AI tutor

AI Tutor

generating consistently high-quality and accurately labeled data through various methods to facilitate the training of NLP algorithms.

Natural Language Processing with Deep Learning

gain the skills to move from word representation and syntactic processing to

designing and implementing complex deep learning models for

question answering, machine translation, and other language understanding tasks

Design, implement, and understand your NLP neural network models, using the Pytorch framework.
Deep learning with pytorch tutorial

Build the model tutorial

create datasets for model training, benchmarking, and overall advancemen

Represent word meaning with word vectors, such as Word2Vec, SVD and GloVe.

SVD and other algorithms

Identify semantic relationships between words in a sentence with dependency parsing.

Dependency parsing

Make large scale word predictions with language models, recurrent neural networks (RNNs), and neural machine translation.
Improve your NLP models and pretrain your transformers for more efficient natural language processing and understanding.

Speech and Language Processing

divide by 3 counter

schmitt-trigger-vs-simple-inverter

Multi-scale system design

Complexity Theory in Axiomatic Design

Four different types of complexity are identified in axiomatic design complexity theory:

time-independent real complexity,

time- independent imaginary complexity,

time-dependent combinatorial complexity and

time-dependent periodic complexity.

AI/ML Hardware engineer

Neural Engine HW Architect

machine learning algorithms, Transformer, convolutional neural networks and their applications in Generative AI, NLP, computer vision and image/video processing

nalyzing ML workloads on different HW architecture — profiling and identifying the performance bottleneck in the system, coming up with suggestions for performance improvement either at algorithm, SW and HW level.

Understanding the system implications of aforementioned algorithms in terms of performance and power on a given HW architecture.
Knowledge in LLM pipeline, image processing, camera pipeline, computational photography and natural language processing

Fabric/Interconnect Engineer

Interconnect and Wire Engineering

Design Margin, Reliability and Scaling

Principles of VLSI design

Wednesday, February 14, 2024

DFT course

dft course

Textbook

 Essentials of Electronic Testing for Digital, Memory and Mixed-Signal VLSI Circuits by M. L. Bushnell and V.D. Agrawal, Kluwer Academic Press, Boston 2000

Lectures

Recommended

 System On Chip Test Architectures: Nanometer Design for Testability by L.T. Wang, C.E. Stroud, N. A. Touba, Elsevier, Morgan Kaufmann Publishers, 2009.

 Digital System Testing and Testable Design by M. Abramovici, M. A. Breuer, and A.D. Friedman, IEEE Press, New York, 1990, 652 pages

Digital Systems testing

Tuesday, February 13, 2024

Design for AT-Speed Test, Diagnosis and Measurement

Design for at speed

Adhyayan

Tuesday, June 04, 2024

What could go wrong?

Sunday, June 02, 2024

Actuators and sensors

How humans learn

Saturday, June 01, 2024

David Tong: Lectures on the Standard Model

David Tong: Lectures on the Standard Model

Diffusion index

Friday, May 31, 2024

AI weekly

Dwarkesh with sholto-douglas-trenton-bricken

Tuesday, May 28, 2024

NeuroAI

FSD without convnets

Monday, May 27, 2024

Foundation models in generative AI

How to become an AI tutor

Multi-scale system design

AI/ML Hardware engineer

Fabric/Interconnect Engineer

Wednesday, February 14, 2024

DFT course

Tuesday, February 13, 2024

Design for AT-Speed Test, Diagnosis and Measurement

Design for AT-Speed Test, Diagnosis and Measurement

About Me

Popular Posts