Tyna <> AI
Posts
Multiple Experts, Multiple Objectives
At long last, I present my Scholars project, where I engineered a (somewhat primitive) framework that disentangles data containing many behaviors from different experts to learn to steer a model towards one mode of behavior or another.
Last updated on Jul 10, 2021
15 min read
The Final Stretch
As I round the corner of the final days of my project, I am striving to succeed in provably demonstrating its practical utility, simplicity and contributions. The neural architecture I have designed so far, however, is anything but simple.
Last updated on Mar 16, 2021
1 min read
Exploring New Depths
Over the last two weeks I have been delving into new depths, turning a problem over in my mind for days at a time, without certainty of success. Continuing to probe at an idea in the face of possible failure can be daunting, and I wanted to share some of the tips that helped me overcome the challenges of designing novel solutions.
Last updated on Mar 2, 2021
3 min read
The Makings of an Option
Reinforcement learning literature involves learning to pursue actions that provide sufficient rewards (or minimize the agent’s cost). As we have previously seen, encouraging continuous actions defined as an “environment step” can be tricky because of the credit assignment problem, wherein the learning function must attribute credit for rewards or costs to some actions taken along a trajectory.
Last updated on Feb 16, 2021
5 min read
Making and Benchmarking a Clone
The Expert: We trained multiple experts at different thresholds and constraints, but in this report we will discuss a configuration set (alias Marigold), for which we ran multiple smaller cloning experiments.
Last updated on Feb 1, 2021
3 min read
Learning with Constraints
Reinforcement Learning as a field has advanced in lockstep with advances in compute power. Iterative generation of tricks improved performance by learning the values of actions. The value function is therefore necessary for choosing actions.
Last updated on Jan 18, 2021
4 min read
Experiments and Logging
My high-level goal over the last week or so involved finishing the write-up for trajectory collection for constrained and unconstrained agents. Immediately upon starting training, I butted up against the question of how and when to log experiments and agent performance.
Last updated on Jan 4, 2021
2 min read
Learning from OpenAI Experts
An emergent trend of my posts so far has been my attempt to link my progress through curriculum study to concepts in artificial intelligence. The end of my learning is no different.
Last updated on Dec 5, 2020
5 min read
Learning with Rewards
Reinforcement learning is the subset of machine learning in which an agent exists within an environment and looks to maximize some kind of reward. The agent takes an action, which alters the environment in some way, and observes the reward associated with the environmental change.
Last updated on Nov 23, 2020
3 min read
80:20 Learning
The past 2 weeks have been a blur, wherein I read papers that significantly advanced my understanding of the evolution of deep learning research and, more broadly, how to learn.
Last updated on Mar 16, 2021
16 min read