ML etc.

February 16, 2025

......

robotics / control / optimization vs CV –> LLM

https://www.zhihu.com/question/22031360/answer/20092442

-——-

https://www.zhihu.com/question/40120261/answer/3605671072

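If dynamics-based control has so many advantages, why are most industrial robot products kinematics-controlled?

Dynamics-based control is harder to implement and more expensive, and kinematic control already meets the needs of most industrial applications, so industrial robot systems generally use kinematic control: no need to use a cannon to kill a mosquito, and it saves cost.

Kinematic control systems are relatively simple and easy to implement, debug, and maintain, and they are stable and robust enough to deliver the precise position control that industrial tasks require.

Kinematic control is very mature in industry, with a complete hardware and software ecosystem that conforms to industry standards and practice, which further drives its adoption. Every product depends on its ecosystem, much like WeChat's position among messaging apps in China.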

https://www.zhihu.com/question/530581913/answer/3607297565

[https://www.zhihu.com/question/633124010/answer/3598256957]

-——-

[https://www.zhihu.com/question/402251886/answer/1344947652]

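Model-based RL is no different from traditional optimal control; it is just a new name. RL sounds impressive because it promises a global control law, but for the vast majority of systems that cannot be done reliably, so in recent years DRL has advocated using deep nets to fit such a control law ( or the value function ).

Whether it works depends on how devout your faith in ML is. MPC is much more down-to-earth: numerical optimization at least returns a local optimum, sometimes one can prove that a local optimum is enough to achieve the desired control performance, and by exploiting the sparsity of the optimization problem, MPC solvers can run in real time. Of course, for complex non-convex problems, local numerical optimization alone is not enough.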

-——-

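An MPC-based vehicle controller: on-board chips are now powerful enough that MPC's computational cost is negligible.
I implemented MPC attitude control on a drone, with the flight controller written on a Raspberry Pi. With a 30-step prediction horizon, solving the QP takes under 1 ms, which is real-time enough for me.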
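To make the 30-step QP concrete, here is a minimal sketch of one receding-horizon loop. Everything specific in it is an assumption for illustration: a double-integrator model stands in for the attitude dynamics, the weights `Q` and `R` and the step `dt` are made up, and the QP has no constraints, so it is solved with a single linear solve rather than a real QP solver.

```python
import numpy as np

def build_qp(A, B, Q, R, N):
    """Condense an N-step horizon into min_U 0.5*U'HU + (F x0)'U."""
    n, m = B.shape
    # Stacked predictions: X = Sx @ x0 + Su @ U, with X = [x_1, ..., x_N]
    Sx = np.vstack([np.linalg.matrix_power(A, k + 1) for k in range(N)])
    Su = np.zeros((N * n, N * m))
    for k in range(N):            # row block: state x_{k+1}
        for j in range(k + 1):    # column block: input u_j
            Su[k*n:(k+1)*n, j*m:(j+1)*m] = np.linalg.matrix_power(A, k - j) @ B
    Qbar = np.kron(np.eye(N), Q)
    Rbar = np.kron(np.eye(N), R)
    H = Su.T @ Qbar @ Su + Rbar
    F = Su.T @ Qbar @ Sx
    return H, F

def mpc_step(x0, H, F, m):
    """Solve the unconstrained QP and apply only the first input."""
    U = np.linalg.solve(H, -F @ x0)
    return U[:m]

def simulate(steps=500, N=30, dt=0.02):
    # Hypothetical double integrator: state = [angle, angular rate]
    A = np.array([[1.0, dt], [0.0, 1.0]])
    B = np.array([[0.5 * dt**2], [dt]])
    Q = np.diag([10.0, 1.0])      # assumed state weights
    R = np.array([[0.1]])         # assumed input weight
    H, F = build_qp(A, B, Q, R, N)
    x = np.array([1.0, 0.0])
    for _ in range(steps):
        u = mpc_step(x, H, F, m=B.shape[1])
        x = A @ x + B @ u
    return x
```

Calling `simulate()` drives the state from `[1, 0]` toward the origin. With input or state limits the linear solve would be replaced by a constrained QP solver, which is where most of the real-time cost comes from.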

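In this era of exploding compute, we can fully embrace high-compute platforms and high-compute algorithms. The idea that MPC and RL are impractical is last-generation thinking. If I remember correctly, Boston Dynamics' quadruped and biped robots and MIT's open-source robot dog all run high-dimensional MPC in real time ( ? )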

( DeepMind ) Towards General and Autonomous Learning of Core Skills: A Case Study in Locomotion
[https://www.semanticscholar.org/paper/Towards-General-and-Autonomous-Learning-of-Core-A-Hafner-Hertweck/afeffb9e05d89b2ac806282d3ed4366d67e4392e]

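This DeepMind work uses no MPC at all; it relies on designing a multi-level reward function and then training repeatedly. DeepMind has always aimed at general intelligence ( ? )

By comparison, ETH's architecture is much more complex and still needs MPC plus a pile of traditional control methods. ETH seems to perform somewhat better ( judging by terrain ), but still seems unable to beat Boston Dynamics' hand-tuned MPC.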

- https://leggedrobotics.github.io/rl-blindloco

- https://leggedrobotics.github.io/rl-perceptiveloco

- https://arxiv.org/abs/2309.14246v2 https://sites.google.com/leggedrobotics.com/risk-aware-locomotion Learning Risk-Aware Quadrupedal Locomotion using Distributional Reinforcement Learning

https://github.com/TinyMPC/TinyMPC

-——-

https://zhuanlan.zhihu.com/p/504256899

https://zhuanlan.zhihu.com/p/115726898

https://www.zhihu.com/question/650407644/answer/3448786787

degrees of freedom

https://www.zhihu.com/question/597238433/answer/3080541702

============

log

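RWKV-7 as a meta-in-context learner : the model's internal world must continuously fit the external world. From this first principle (?) one can directly write down the exact RWKV-7 formula https://zhuanlan.zhihu.com/p/9397296254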

https://www.rwkv.com/images/RWKV-7.png https://github.com/BlinkDL/RWKV-LM/tree/main/RWKV-v7

RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 “Goose”.
combining the best of RNN and transformer – great performance, fast inference, fast training, saves VRAM, “infinite” ctxlen, and free text embedding. Moreover it's 100% attention-free, and a Linux Foundation AI project. https://www.rwkv.com/

I believe RNN is a better candidate for fundamental models, because: (1) It's more friendly for ASICs (no kv cache). (2) It's more friendly for RL. (3) When we write, our brain is more similar to RNN. (4) The universe is like an RNN too (because of locality). Transformers are non-local models.

All the trained models will be open-source. Inference is very fast (only matrix-vector multiplications, no matrix-matrix multiplications) even on CPUs, so you can even run a LLM on your phone.

RWKV is parallelizable because the time-decay of each channel is data-independent (and trainable).
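A toy illustration of why a data-independent decay allows parallel training. This is a deliberately simplified per-channel linear recurrence, `s_t = w * s_{t-1} + kv_t`, and not the actual RWKV formula; the names `w` and `kv` are placeholders chosen for this sketch. Because `w` does not depend on the input, every `s_t` can be written as a fixed weighted sum over past inputs and computed for all `t` at once.

```python
import numpy as np

def recurrent_scan(w, kv):
    """Sequential (RNN) form: s_t = w * s_{t-1} + kv_t, one step at a time."""
    s = np.zeros_like(kv[0])
    out = []
    for x in kv:
        s = w * s + x
        out.append(s)
    return np.stack(out)

def parallel_scan(w, kv):
    """Parallel form: s_t = sum_{i<=t} w**(t-i) * kv_i, for all t at once."""
    T = kv.shape[0]
    idx = np.arange(T)
    lag = idx[:, None] - idx[None, :]              # lag[t, i] = t - i
    causal = lag >= 0                              # only past and present inputs
    powers = w[None, None, :] ** np.maximum(lag, 0)[:, :, None]
    decay = np.where(causal[:, :, None], powers, 0.0)   # shape (T, T, C)
    return np.einsum("tic,ic->tc", decay, kv)

rng = np.random.default_rng(0)
w = rng.uniform(0.5, 0.99, size=4)     # per-channel decay, independent of the data
kv = rng.normal(size=(16, 4))          # stand-in for the per-token contributions
assert np.allclose(recurrent_scan(w, kv), parallel_scan(w, kv))
```

The sequential form is what inference uses (constant state, no KV cache); the parallel form is what makes GPT-style training possible.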

RWKV-7 uses parallelized mode to quickly generate the state, then use a finetuned full RNN (the layers of token n can use outputs of all layer of token n-1) for sequential generation.

Were RNNs All We Needed? https://news.ycombinator.com/item?id=41732853 https://arxiv.org/abs/2410.01201

Dec 18, 2024

============

https://news.ycombinator.com/item?id=42405323 Phi-4: Microsoft's Newest Small Language Model Specializing in Complex Reasoning > The most interesting thing about this is the way it was trained using synthetic data, which is described in quite a bit of detail in the technical report: https://arxiv.org/abs/2412.08905
https://www.theverge.com/2024/12/13/24320811/what-ilya-sutskever-sees-openai-model-data-training

Dec 23, 2024

============

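One major problem facing OpenAI is the lack of diverse, high-quality data : the public internet does not have enough data for training. Another problem is talent attrition.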

https://openai.com/12-days/?day=12 https://slashdot.org/story/24/12/22/0333225/openais-next-big-ai-effort-gpt-5-is-behind-schedule-and-crazy-expensive

Dec 31, 2024

============

Does this mean you don't need large GPU clusters for frontier LLMs? No but you have to ensure that you're not wasteful with what you have ... https://www.solidot.org/story?sid=80186 DeepSeek says its new model cost only USD 5.5 million to train

https://github.com/orgs/deepseek-ai/repositories?type=all

V2 https://www.zhihu.com/question/655172528/answer/3490846800

Jan 6

====== ======

DeepSeek-V3 currently offers the best price-performance https://github.com/deepseek-ai/DeepSeek-V3/blob/main/DeepSeek_V3.pdf

Math

1. o3 scored 96.7% in the AIME math competition (up from 83.3% for o1), improving by 13.4 percentage points.
1. o3 achieved 87.7% accuracy on the PhD-level GPQA Diamond benchmark, well over the average of 70% for PhD experts.
1. For the Epoch AI FrontierMath benchmark, o3 is the only AI model to exceed 25% accuracy, while others typically score under 2%.

Extremely Expensive: The cost of o3 is currently 1000 times that of o1. According to the ARC-AGI testing standards, solving one problem with o3 in low mode costs around $20, but in high mode, the cost can skyrocket to around $3,440 per task. For instance, asking “Which is greater, 9.09 or 9.11” would set you back $1,600,250 for a total of 400 + 100 tasks in the ARC-AGI test.

https://x.com/asankhaya/status/1858714149557334481

Competitive coding

Competitive coding is usually difficult, but not complex. Software Engineering is usually complex, but not difficult. https://www.reddit.com/r/programming/comments/1hir7lb/the_new_openai_model_o3_scores_better_than_998_of/

https://www.reddit.com/r/singularity/comments/1fh5683/a_codeforces_user_used_o1mini_in_a_live_contest/

SWE-bench

Can Language Models Resolve Real-world Github Issues? https://openreview.net/forum?id=VTF8yNQM66

https://www.reddit.com/r/LocalLLaMA/comments/1hc276t/gemini_20_flash_beating_claude_sonnet_35_on/

Jan 24

============

// 1. jump-game

https://sara-hy.github.io/2019/01/10/leetcode-JumpGame/

https://builtin.com/software-engineering-perspectives/jump-game-leetcode
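A minimal greedy sketch, assuming the links refer to the classic LeetCode "Jump Game" (each entry is the maximum jump length from that index; decide whether the last index is reachable). The function name `can_jump` is chosen here for illustration.

```python
def can_jump(nums):
    """Return True if the last index is reachable from index 0."""
    farthest = 0                      # farthest index reachable so far
    for i, step in enumerate(nums):
        if i > farthest:              # index i itself cannot be reached
            return False
        farthest = max(farthest, i + step)
    return True

print(can_jump([2, 3, 1, 1, 4]))  # True
print(can_jump([3, 2, 1, 0, 4]))  # False
```

One pass and O(1) extra memory; the only state is the farthest reachable index.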

// 2.1 https://github.com/PKUanonym/REKCARC-TSC-UHT/tree/master/%E5%A4%A7%E4%B8%80%E5%B0%8F%E5%AD%A6%E6%9C%9F/Week_2-Socket/hw

// 2.2 Gomoku self-play https://ne7ermore.github.io/post/alpha-zero/

https://www.nature.com/articles/nature24270.epdf?author_access_token=VJXbVjaSHxFoctQQ4p2k4tRgN0jAjWel9jnR3ZoTv0PVW4gB86EEpGqTRDtpIz-2rmo8-KG06gqVobU5NSCFeHILHcVFUeMsbvwS-lxjqQGg98faovwjxeTUgZAUMnRQ

// 3. https://techxplore.com/tags/board+games/

https://www.science.org/doi/10.1126/sciadv.adg3256

** https://deepmind.google/discover/blog/

I've already been made obsolete ......

============

// – 1. https://openreview.net/forum?id=XYK1eGjahp

https://m.youtube.com/watch?v=o7uac6DuzcQ

How our work differs from the above-mentioned results: Many of the above papers are focused on problems in P or P/poly, while 3-SAT is an NP-complete problem. It is widely believed that P is a strict subset of NP, and it is not known whether NP is a subset of P/poly. In other words, our results are not comparable to these earlier results.

Meanwhile, Pérez et al. (2019), Li et al. (2024), and Merrill & Sabharwal (2024) also show that Transformers can simulate single-tape Turing Machines (TM) with CoT and can theoretically be extended to arbitrary decidable languages. However, these constructions require at least one CoT token for every step of TM execution.

By contrast, our theoretical construction demonstrates that, for certain classes of formal reasoning problems, Transformers can simulate algorithmic reasoning traces at an abstract level with drastically reduced number of CoT tokens compared to step-wise emulation of a single-tape TM. At each CoT Step, our construction performs deductive reasoning over the full input in parallel while any single-tape TM must process each input token sequentially.

Furthermore, the CoT produced by our theoretical construction abstractly represents the human reasoning process of trial and error, as demonstrated in Figure 1.
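For reference, the "deduce over the whole formula, then guess and backtrack" pattern is what a classical DPLL-style solver does. The sketch below is a generic textbook-style solver and not the paper's Transformer construction; clauses are lists of non-zero integers (negative means negated), and the input is assumed to contain no empty clause.

```python
def simplify(clauses, lit):
    """Assume `lit` is true: drop satisfied clauses, shrink the others."""
    out = []
    for clause in clauses:
        if lit in clause:
            continue                       # clause already satisfied
        reduced = [x for x in clause if x != -lit]
        if not reduced:
            return None                    # empty clause: contradiction
        out.append(reduced)
    return out

def dpll(clauses, assignment=()):
    """Return a tuple of true literals satisfying all clauses, or None."""
    # Deduction: propagate unit clauses until none remain
    while True:
        units = [c[0] for c in clauses if len(c) == 1]
        if not units:
            break
        clauses = simplify(clauses, units[0])
        if clauses is None:
            return None
        assignment += (units[0],)
    if not clauses:
        return assignment                  # every clause satisfied
    # Trial and error: guess a literal, backtrack on failure
    lit = clauses[0][0]
    for guess in (lit, -lit):
        reduced = simplify(clauses, guess)
        if reduced is not None:
            result = dpll(reduced, assignment + (guess,))
            if result is not None:
                return result
    return None
```

For example, `dpll([[1, 2, 3], [-1, -2, 3], [-3, 1, 2]])` returns a satisfying set of literals, while `dpll([[1], [-1]])` returns `None`.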

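// – 2. > For example, a 2023 ICLR paper showed that the Transformer architecture is capable of implementing several different linear regression algorithms ; this positive result again demonstrates its generality https://x.com/xleaps/status/1627873094814531584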

& https://twitter.com/xleaps/status/1627844991824297984

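Opening-the-black-box example : when a neural network is trained to compute modular addition a+b = ? (mod c), it suddenly reaches 100% accuracy at some point in training. It computes modular addition with a Fourier transform, an algorithm that can be proven correct and is counterintuitive to humans ( ? )

In this example, roughly 1,400 training steps moved the network from simply memorizing the training data toward the Fourier transform, roughly 9,000 steps let the Fourier transform implement modular addition perfectly, and the remaining few thousand steps let the network clean up its earlier memorization and shift all of its weights onto the subnetwork that implements the Fourier-transform algorithm.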
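A sketch of how modular addition can be done with Fourier features, to show why the algorithm is provably correct. This is an idealized version that sums over every frequency; it is not a claim about exactly which frequencies or weights the trained network ends up with. The candidate answer `r` scores highest exactly when `r == (a + b) % c`, because the cosines of `2*pi*k*(a + b - r)/c` all equal 1 there and cancel out otherwise.

```python
import math

def mod_add_fourier(a, b, c):
    """Compute (a + b) mod c using only sines and cosines of a, b and r."""
    scores = [0.0] * c
    for k in range(1, c):
        w = 2 * math.pi * k / c
        # Angle-addition identities combine the per-input features
        cos_ab = math.cos(w * a) * math.cos(w * b) - math.sin(w * a) * math.sin(w * b)
        sin_ab = math.sin(w * a) * math.cos(w * b) + math.cos(w * a) * math.sin(w * b)
        for r in range(c):
            # cos(w * (a + b - r)), expanded the same way
            scores[r] += cos_ab * math.cos(w * r) + sin_ab * math.sin(w * r)
    # The correct residue scores c - 1, every other residue scores -1
    return max(range(c), key=scores.__getitem__)

assert all(mod_add_fourier(a, b, 7) == (a + b) % 7
           for a in range(7) for b in range(7))
```

No integer addition or remainder appears inside the function: the answer falls out of constructive interference between waves.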

LLMs can be seen as heuristics we adopt because we are unable to control emergence

// – 3. SFT overfitting : short term ; RL generalize : long term https://pbs.twimg.com/media/Gje3D_AbYAAuBfP?format=jpg&name=large

============

https://blog.computationalcomplexity.org/2025/02/research-then-and-now.html

Low hanging fruit : computational complexity was just 20 years old when I started grad school ... the P v NP problem ... You didn't need to know deep math and most graduate students could follow nearly all of the 47 talks at my first conference ( STOC 1986 ), not likely in the STOC 2025 papers.

Now you need to spend time climbing the trees and going down deep branches to find new problems that only people on nearby branches would care about or even understand.

But who knows, AI may soon climb those branches for you.

============

// – 1. AI-induced cognitive decline : needs quantitative measurement https://www.solidot.org/story?sid=80529 – https://www.microsoft.com/en-us/research/publication/the-impact-of-generative-ai-on-critical-thinking-self-reported-reductions-in-cognitive-effort-and-confidence-effects-from-a-survey-of-knowledge-workers/

// – 2. https://x.com/wwwyesterday/status/1886363047255773357

// – 3. DeepSeek-R1 : jailbreak 100% https://x.com/rohanpaul_ai/status/1886025249273339961

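// – 4. > Some analysts believe the market price of the Nvidia GPUs DeepSeek used is roughly 10-30% lower than that of the cutting-edge products used by US companies. DeepSeek used 2,000-3,000 H800s ( the China-specific version of the H100 ) to develop its V3 AI model ; a simple calculation puts the total GPU cost at roughly RMB 385 million – 721 million. DeepSeek explained that V3's development cost was USD 5.576 million, assuming 2.788 million hours of AI training at USD 2 per hour, which is less than one-tenth of the cost of US AI models. University of Tokyo professor Yutaka Matsuo pointed out : developing an AI model takes tens or hundreds of rounds of trial and error, and time was also spent before the roughly 2.8 million hours of training ; it is logical to think this way, and the time and GPUs spent on that should be included in the cost. https://d9shhjt4p7ouc.cloudfront.net/china/ccompany/57993-2025-02-11-10-33-49.html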
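The quoted training cost is plain multiplication, which is easy to check. The sketch below uses only the figures from the note; the wall-clock estimate additionally assumes every GPU ran in parallel for the whole run, which is an idealization added here.

```python
def training_cost_usd(gpu_hours, usd_per_gpu_hour):
    """Rental-style cost estimate: GPU hours times an hourly price."""
    return gpu_hours * usd_per_gpu_hour

def wall_clock_days(gpu_hours, n_gpus):
    """Days of continuous training if all GPUs ran in parallel (idealized)."""
    return gpu_hours / n_gpus / 24

cost = training_cost_usd(2_788_000, 2)    # 2.788M GPU hours at USD 2 per hour
print(f"${cost:,}")                        # $5,576,000
for n in (2000, 3000):                     # GPU-count range quoted in the note
    print(n, "GPUs:", round(wall_clock_days(2_788_000, n), 1), "days")
```

This reproduces the USD 5.576 million figure and makes Matsuo's point visible: the number covers one final training run, not the hardware purchase or the earlier experiments.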

// – 5. how to define「trust」? the survey format does not rule out bias ? no fine-grained quantification https://www.solidot.org/story?sid=80555

============

out-dated

https://x.com/wwwyesterday/status/1873785958891741412

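AI may work fairly well for front-end code (and perhaps some back-end code), because the logic is relatively simple and there is a lot of training data. But for non-front-end work (such as C/C++ or other languages), where the training data is far smaller, it may not work as well as it does for front-end (in fact, even for front-end, more specialized scenarios are not as good as the internet hype suggests). Even if AI writes the code, you still have to read and understand it, fix bugs, and debug.

Some people think estimating a project's overall line count is a fantasy. I can only find that regrettable. Of course, not every programmer needs this ability, but if you lead even a small team and have no sense of the order of magnitude (it need not and cannot be precise) of the project (product) you must deliver, you are simply unqualified.

If you have been writing more or less the same thing for 30 years (in other words, the thing is quite mature), then you can indeed estimate accurately ; for something new, it is impossible.

It's your own project, so you can estimate it that way. The point is not the code; the point is the requirements.

AI requirements estimator ?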

Jan 06, 2025

============

https://polarisxu.studygolang.com/posts/basic/diagram-float-point/

https://www.zhihu.com/question/511856781/answer/2890338282

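> Writing 96 million floats to a file took their C++ code about 300 seconds at the time. Although I had not used C++ for about ten years, I eventually got it down to under 2 seconds.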
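The quote does not say how the speedup was achieved. One common cause of this kind of gap is issuing one formatted write per value instead of one bulk write, so the sketch below contrasts the two patterns. It is an assumption about the general technique, written in Python rather than the original C++, and the function names are made up for illustration.

```python
import array
import os
import tempfile

def write_one_by_one(values, path):
    """Slow pattern: format each value as text and write it separately."""
    with open(path, "w") as f:
        for v in values:
            f.write(f"{v:.6f}\n")

def write_bulk_binary(values, path):
    """Fast pattern: dump the raw 4-byte floats in a single call."""
    with open(path, "wb") as f:
        array.array("f", values).tofile(f)

def read_bulk_binary(path, count):
    """Read `count` raw 4-byte floats back from the file."""
    out = array.array("f")
    with open(path, "rb") as f:
        out.fromfile(f, count)
    return out

with tempfile.TemporaryDirectory() as tmp:
    data = [i * 0.5 for i in range(1000)]      # small stand-in for 96 million values
    path = os.path.join(tmp, "floats.bin")
    write_bulk_binary(data, path)
    assert list(read_bulk_binary(path, len(data))) == data
```

If the output must be text in a user-specified format, the same idea still applies: build large chunks in memory and write them in few calls.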

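The project they had at the time scanned simply shaped parts and needed to fit the shape, length, thickness, etc. of the edges, finally producing a file in a user-specified format. In the later collaboration I found that the "algorithm" they talked about was a set of formulas derived from trigonometry and solid geometry ; anyone who knows a bit of programming understands that this is meaningless in real-world programming.

Although I had no prior knowledge of computer vision processing, I learned OpenCV and solved some of the key problems so that the project could be delivered.

90% are people from mechanical engineering, architecture, and the like https://caadxyz.github.io/blog/python4rhino-zh

Or else CV / CG algorithm engineers ( people from academia most likely are not ) https://m.douban.com/book/subject/1841346/ Mathematical algorithms are not「computer algorithms or computer engineering」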

============

Jan 6

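Information technology is a very broad concept. It is not just the internet that needs coders : there is also traditional software, ERP, automation, banking, and so on. Knowledge depreciates at different rates across these industries. Banking has used antique COBOL for many years, and turnover is extremely slow ; C in embedded automation is almost irreplaceable ; traditional software is more mixed, mainly Java and other C-family languages, where knowledge depreciates fairly quickly but still acceptably, because traditional software mostly targets specific customers and mass-market products are a small share. On the internet, every product hopes to reach all 6 billion people worldwide : a complex environment, huge numbers of users, and extremely fast requirement changes.

Although these products are all coder work, the skills required are completely different. Banking needs data consistency and security ; speed can be sacrificed ; requirement changes are made cautiously ; it demands a very deep understanding of the business. The barrier to entry is the business, not the technology. Banking has dedicated banking-system testing positions which, as far as I know, generally take only people with experience in the same industry ; being technically brilliant is not enough.

Traditional software needs some adaptability and a deep understanding of business requirements. For an airline settlement system, for example, you need to figure out how settlement actually works. Requirements do not change often. Traditional software has roles like「systems analyst」, somewhat like a product manager but more technical, who also consider the software's technical difficulties while accommodating requirement changes.

The internet changes every day. A campaign page goes up today and is taken down overnight. Talk to people about systems analysis and there is nothing to analyze : what matters is that the thing holds the traffic without going down once it is built. Timeliness comes first ; minor bugs are ignored ; if a server crash can be fixed by a restart, ship first and deal with it later (and much of that "later" never gets followed up).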

https://www.zhihu.com/question/41328565/answer/92389789 horizontal scaling https://x.com/ingramchen/status/1699739008723808629#m

A devoted disciple of static strong typing and algebraic data types

============

Jan 6

Using AI Generated Code Will Make You a Bad Programmer | Hacker News https://news.ycombinator.com/item?id=41913458

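Front-end is「harder」because of its「complex system」nature and fast iteration

There used to be no front-end / back-end split ; only when capital poured into the internet did horizontal scaling emerge ( platform monopoly , traffic and attention economy )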

Front-end technologies are not “progressing rapidly”, they are thrashing. People are (rightfully) extremely frustrated with the limitations of the tools they have, so they (wrongfully) go off and invent new tools that get around those limitations, while inevitably re-introducing tons of difficulty in whatever the tools they're replacing were developed to avoid. https://news.ycombinator.com/item?id=37300579

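Dealing with「complex systems」and「people」is thankless work

No money and PTSD on top : the worst job in the world https://www.36kr.com/p/1725124296705

If it weren't for「faith and morality」 https://www.zhihu.com/question/284312614/answer/465879747

Mathematics : an oversimplified「model」 , with a sense of control you don't get depressed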

============

Jan 6

Compiler and reverse-engineering people ( experts ) will not be replaced

Recent work shows that LLMs demonstrate unequivocally superior results in being able to explain and resolve compiler error messages—for decades, one of the most frustrating parts of learning how to code. However, LLM-generated error message explanations have only been assessed by expert programmers in artificial conditions.

This work sought to understand how novice programmers resolve programming error messages (PEMs) in a more realistic scenario. We ran a within-subjects study with n = 106 participants in which students were tasked to fix six buggy C programs. For each program, participants were randomly assigned to fix the problem using either a stock compiler error message, an expert-handwritten error message, or an error message explanation generated by GPT-4.

Despite promising evidence on synthetic benchmarks, we found that GPT-4 generated error messages outperformed conventional compiler error messages in only 1 of the 6 tasks, measured by students’ time-to-fix each problem.

Handwritten explanations still outperform LLM and conventional error messages, both on objective and subjective measures. https://dl.acm.org/doi/fullHtml/10.1145/3689535.3689554 Not the Silver Bullet: LLM-enhanced Programming Error Messages are Ineffective in Practice

============

two major “tech” waves boosted with capitalism

- 1. web , front-end , cloud // attention-grasping economy & privacy surveillance : ad blocker , RSS & VPN , sino app

https://anvaka.github.io/map-of-github/#10.31/-19.0547/-34.1612

- 2. AI

============