Shining Star



RL-14 Policy Evaluation state-of-the-art

Posted on 2018-12-22 | In Reinforcement Learning

Fitting the value function


RL-13 tips for reinforcement learning

Posted on 2018-10-17 | In Reinforcement Learning

MDP


variational inference and application

Posted on 2018-10-10 | In Machine Learning

Understand and derive variational inference, then apply it to variational linear regression, variational EM, the variational autoencoder, Weight Uncertainty in Neural Networks, and VIME.


Convex-Optimization-Summary

Posted on 2018-07-08 | In Optimization

Placeholder, still waiting to be filled in!


Logic Programming with Artificial Intelligence

Posted on 2018-06-18 | In AI

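While browsing Zhihu this morning I came across this question: why can't AI understand logic problems? This semester, in my formal verification course, I ran into a great deal of logical reasoning and thought about the same question. Today's state-of-the-art AI models are mostly statistical. Whether Bayesian or deep learning, at their core they are statistical models that need large amounts of data and frequent training. This raises two immediate problems: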


Setting Up a VSCode C++ Development Environment

Posted on 2018-06-11 | In IDE

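I will be writing a lot of C++ in the future, so I need an IDE that feels comfortable. Visual Studio is too bloated and cannot be used on Linux, while Dev-C++ is too bare-bones. After weighing the options I chose VSCode, though I will still use CLion on Linux. This post records the steps for setting up a C++ development environment in VS Code.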


RL-12 transfer learning

Posted on 2018-06-04 | In Reinforcement Learning
  1. The benefits of sharing knowledge across tasks
  2. The transfer learning problem in RL
  3. Transfer learning with source and target domains
  4. Next time: multi-task learning, meta-learning

RL-11 Exploration

Posted on 2018-05-28 | In Reinforcement Learning

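A perennial problem in RL is the exploration-exploitation trade-off. Starting from the (stateless) multi-armed bandit model, this lecture covers the principles and practice of three classes of exploration methods (optimism-based, posterior matching, and information-theoretic) and extends them to deep RL.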


RL-10 Inverse Reinforcement Learning

Posted on 2018-05-20 | In Reinforcement Learning

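Building on the previous lecture, this post introduces inverse reinforcement learning, which handles cases where the reward cannot be obtained directly, and lays the groundwork for the later posts on exploration and transfer learning.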


RL-9 Connections Between Inference and Control

Posted on 2018-05-17 | In Reinforcement Learning

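Using a probabilistic graphical model view, this post connects optimal control with reinforcement learning (Q-learning, policy gradient). The same view can also explain human behavior, and its inference component is the basis of inverse reinforcement learning.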

hackerHugo

© 2018 hackerHugo