(Machine learning experts, please go easy on me. Readers, please take this with a grain of salt. The analogy I'm about to make is a loose one, but it's an interesting one nevertheless.)
The other day, I stumbled upon a set of tweets from @nickcammarata, who I find to be an insightful mind in the AI space.
In his tweets, he presented an intriguing perspective on life:
“Living life as gradient descent is so relaxing. No need to worry about holding everything at once, just look at things when they come up and improve them a step at a time when you can
Having to hold your whole life at once to analyze the overall structure and figure out what the best overall structure is and how to go from here to there sounds heavy”
But what was even more interesting to me was this particular discussion that followed in the replies:
I found the last tweet in this interaction to be particularly enlightening. It's a simple idea, but the analogy is a fun one.
But to fully appreciate the wisdom embedded in this tweet, let me help you understand what gradient descent is, in layman's language.
Gradient descent is a fundamental concept in the field of machine learning and optimization, describing a method to find the minimum of a function.
Imagine you're standing on a hilly terrain with valleys and peaks, and your goal is to reach the lowest point. With gradient descent, you'd look around, identify the direction where the slope is steepest downhill, and take a step in that direction. Repeat this process, and you'd gradually approach the lowest point, or the ‘global minimum.’
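To make the hill-walking picture concrete, here's a minimal sketch of that loop in Python. The function f(x) = (x - 3)², the starting point, and the step size are all made up purely for illustration:

```python
# Minimal gradient descent on f(x) = (x - 3)**2, whose lowest point is at x = 3.
def grad(x):
    return 2 * (x - 3)  # the slope of f at x

x = 0.0    # where we start on the "terrain"
lr = 0.1   # how big a step we take each time (the learning rate)

for _ in range(100):
    x -= lr * grad(x)  # step in the direction of steepest descent

print(x)  # after enough steps, x sits very close to the minimum at 3
```

Each pass of the loop is the "look around, step downhill" move from the hiking analogy; the learning rate is simply how bold each step is.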
Now, here's where things get interesting.
In more complex situations, say a terrain with numerous hills, valleys, and ridges, some of the dips you encounter are ‘local minima’ — spots that are lower than their immediate surroundings, but not the lowest point overall. It's like settling into a comfortable dip in the landscape while missing out on the deep, serene valley just a few hills away. In lower dimensions, it's easy for the algorithm to get stuck in these local minima.
But as the number of dimensions (or variables) increases, it becomes less likely for the algorithm to get trapped in these local dips. With more directions to explore, there are more opportunities to escape a local minimum and continue the journey toward the global minimum. (The function being minimized is usually some error or loss function: a measure of how well the model's predictions match the actual data.)
Also, as dimensions increase, the probability of encountering a saddle point (a point where some dimensions are at a local minimum and others are at a local maximum) increases. This is because each dimension of the function can be increasing or decreasing independently of the others. A local maximum in one or more dimensions can easily be offset by a local minimum in another dimension, turning the point into a saddle point rather than a true local minimum or maximum.
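A quick way to see a saddle point is with the textbook toy function f(x, y) = x² − y², chosen here purely for illustration. It curves upward along x and downward along y, so the flat spot at the origin is neither a minimum nor a maximum, and plain gradient descent slides off it instead of stalling:

```python
# f(x, y) = x**2 - y**2 has a saddle at (0, 0):
# a minimum along the x-axis, a maximum along the y-axis.
def grad(x, y):
    return 2 * x, -2 * y  # partial derivatives of f

x, y = 0.5, 1e-6  # start just barely off the saddle point
lr = 0.1

for _ in range(200):
    gx, gy = grad(x, y)
    x, y = x - lr * gx, y - lr * gy

# x gets pulled toward its minimum at 0, while the tiny y-component grows:
# the descent escapes along the downhill dimension rather than getting stuck.
print(x, y)
```

The extra dimension is exactly the escape route: what looks like a resting point along x is still downhill along y.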
Now, here's the analogy:
If we apply this concept to life, we can think of our reality as a high-dimensional space, composed of numerous variables or dimensions. These could be our relationships, career, health, hobbies, values, etc. When we seek improvement, it's as if we're performing gradient descent in this high-dimensional space (or, if you prefer to picture climbing toward peaks, gradient ascent — the same idea with the sign flipped), taking steps in the steepest direction and continually improving our situation in life.
Consider your career as an example, especially when you're just starting out. It's easy to get swayed by the salient dimensions — brand name of the company, the size of the paycheck, the prestige associated with certain roles. You're like a traveler in a foreign land, attracted by the glitz and glamour of the well-lit attractions.
You might be drawn to a job at a high-profile tech company that pays well and is known worldwide. It feels like a local maximum: you're earning a good salary, your LinkedIn profile looks impressive, and your parents can't stop bragging about you.
But then, you start to realize that you're working long hours, you're not particularly passionate about the product you're working on, and you don't feel a strong connection with your colleagues.
In a way, I have drawn a similar analogy to the greedy algorithm in the past.
In case you haven't read it: a greedy algorithm makes the locally best choice at each step in the hope of reaching a global optimum. But this approach can often lead to suboptimal solutions — like getting stuck in a local maximum. It's akin to choosing a job just because it offers the highest salary right now, without considering other important factors.
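As a toy illustration of greedy going wrong (the coin denominations and target below are contrived for the example): making change for 30 with coins of 25, 10, and 1, always grabbing the largest coin first uses six coins, while three 10s would do:

```python
def greedy_coins(target, denominations):
    """Always take the largest coin that still fits: the greedy choice."""
    coins = []
    for d in sorted(denominations, reverse=True):
        while target >= d:
            target -= d
            coins.append(d)
    return coins

print(greedy_coins(30, [25, 10, 1]))  # [25, 1, 1, 1, 1, 1]: six coins
# The globally optimal answer, [10, 10, 10], uses only three.
```

Grabbing the 25 is the best single step, but it strands you in a poor neighborhood of solutions — the algorithmic version of taking the flashiest job offer.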
In contrast, if your model of reality included more dimensions — such as your true inclinations, the people you'll be working with, the product you'll be contributing to, and how well you fit with the company's culture and mission — you would have a more comprehensive view of your career landscape.
You might realize that a job with a slightly lower salary at a smaller company, but where you truly believe in the product and enjoy your colleagues, could be a better fit for you. In the multidimensional space of your career, this could be your global maximum — a place where you're not just earning a living, but truly living your passion.
By adding more dimensions to your model, you increase the chances of escaping local maxima and finding a more globally optimal solution — a career that not only pays well but also aligns with your values, plays to your strengths, and brings you joy, satisfaction, and a sense of purpose.
It's about understanding that your career is a multivariate problem and aiming to increase the resolution of your model, rather than living with a low-res model that focuses solely on the most obvious indicators of success.
And once your model of reality is sufficiently high-dimensional, you arrive at a point where @nickcammarata's original set of tweets starts making a lot more sense:
“Living life as gradient descent is so relaxing. No need to worry about holding everything at once, just look at things when they come up and improve them a step at a time when you can
Having to hold your whole life at once to analyze the overall structure and figure out what the best overall structure is and how to go from here to there sounds heavy”
As there's no fear of getting stuck in a local maximum, you simply optimize and solve problems as and when they come, knowing that it is almost impossible for you to get stuck and stagnate. You always find yourself with dimensions you can improve on. And you can work on them as and when you spot an opportunity for a fix, without living with the burden of having a “best overall structure,” as Nick puts it — which can feel heavy and unattainable and bog you down with expectations.
I share this not as a definitive guide to life but as a gentle proposal — an interesting way to think about life and careers: a reminder that life, like a high-dimensional space, is full of directions to explore. And while we may sometimes find ourselves in comfortable dips, let's not forget about the serene valleys that could be just a few hills away. As Nick suggests, life is a gradient descent, a continuous journey of improvement, one step at a time.
So, here's to taking that next step as and when it comes, and looking forward to wherever it may lead us.