OpenAI recently released their 4o image generation model. The GPT-4o image model differs from previous diffusion models in that it is:
Multimodal-native: Unlike diffusion models that generate images from text prompts only, 4o can directly understand and generate across text, images, and audio in a unified architecture.
Non-diffusion-based: It doesn’t use a step-by-step denoising process like Stable Diffusion or DALL·E 2. Instead, image reasoning and generation are integrated more like language modeling, allowing for faster and more flexible interaction.
This has led to a giant step up in the model's usability. The long prompts of the Midjourney days are gone, and we can now collaborate more closely with the model to reach our desired outputs.
A eye
A short story
Of immortality
Vibe coding is a new paradigm from early 2025 that essentially refers to writing software with the help of LLMs, without actually writing any of the code yourself.
As the videos below demonstrate, this is not something just for junior programmers. Many seasoned programmers are moving over to this paradigm, as the efficiency gains are simply incomparable to writing code by hand.
For now, a technical background and systems thinking are still helpful to guide the development process. Nonetheless, the difference between what it meant to be a programmer two years ago and what it means today is quite whiplash-inducing.
Vibe coding in 2025 will do for software what Midjourney has done for image generation since 2023: there will be a massive amount of output, but getting to a final, shippable product will still require tenacity and reliance on traditional skills.
DeepSeek-R1 represents a major breakthrough in AI development, not just for its impressive performance but for the significant cost reductions it introduces. Unlike many large-scale models that require massive computational resources, DeepSeek has managed to develop a model on par with OpenAI’s leading systems at a fraction of the cost. This efficiency makes high-performance AI more accessible, opening doors for businesses, researchers, and developers who previously faced prohibitive expenses when integrating advanced AI into their work.
By dramatically lowering the cost of AI inference and training, DeepSeek-R1 could drive widespread adoption across industries, from healthcare and finance to education and creative fields. Companies that once relied on expensive proprietary models may now have access to open-source alternatives without compromising on quality. This shift not only democratizes AI but also increases competition, pushing the industry toward more sustainable and cost-effective innovation. If this trend continues, AI deployment could become significantly cheaper, leading to a future where high-quality AI assistance is a standard tool rather than a luxury reserved for the largest tech companies.
The immediate takeaway is that many use cases that were deemed unviable only recently are suddenly much more feasible.
Chain of Thought (CoT) prompting has already proven to be a powerful method for improving the reasoning capabilities of large language models (LLMs) by breaking down complex problems into intermediate logical steps. This approach not only enhances accuracy but also makes AI decision-making more transparent and interpretable.
OpenAI’s latest research on Learning to Reason with LLMs builds on this idea, demonstrating that explicit reasoning techniques—such as self-reflection, verification, and structured problem-solving—can further optimize AI performance. Instead of merely predicting an answer based on surface-level patterns, LLMs can be trained to reason step by step, much like a human working through a problem.
By integrating reasoning models with CoT, we move toward AI systems that don’t just generate responses but actively “think” through challenges in a structured way. This has major implications for fields that require rigorous logical processing, such as mathematics, scientific research, law, and medical diagnostics. More importantly, these techniques reduce hallucinations, improve reliability, and offer insights into how AI reaches its conclusions, making them far more usable in real-world decision-making.
Test-Time Compute (TTC) is a key concept in Reasoning Models, allowing AI more time to think and refine its reasoning during inference. Instead of relying solely on model size, TTC scales performance by optimizing processing depth, enabling more accurate and reliable outputs in complex problem-solving.
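To make this concrete, below is a minimal sketch of one common way to spend extra test-time compute: sampling several independent reasoning chains and taking a majority vote over the final answers (often called self-consistency). The `generate` function is a placeholder for whatever completion API you use, and the prompt format and answer-extraction pattern are illustrative assumptions, not any vendor's specific interface.

```python
import re
from collections import Counter

def generate(prompt: str, temperature: float = 0.8) -> str:
    """Placeholder: wire this to any LLM completion API."""
    raise NotImplementedError

def self_consistency(question: str, n_samples: int = 8) -> str:
    """Spend more test-time compute by sampling several reasoning
    chains and majority-voting over their final answers."""
    prompt = (
        f"{question}\n"
        "Let's think step by step, then finish with 'Answer: <value>'."
    )
    answers = []
    for _ in range(n_samples):  # more samples = more compute spent at inference
        completion = generate(prompt, temperature=0.8)
        match = re.search(r"Answer:\s*(.+)", completion)
        if match:
            answers.append(match.group(1).strip())
    # The most frequent final answer wins; ties fall to the first one seen.
    return Counter(answers).most_common(1)[0][0] if answers else ""
```

The design choice here is that `n_samples` is the compute dial: raising it costs more inference time but, on multi-step problems, tends to improve accuracy without touching the model's weights.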
Chain of Thought is a prompting technique that guides an LLM toward the correct answer on more complicated questions. What the research demonstrates is that by providing the intermediate steps involved in reasoning towards an answer, we can greatly improve the outcomes of our current state-of-the-art language models across a variety of domains.
This procedure is familiar as a teaching method: the instructions lead the student toward a completed answer, teaching the steps along the way.
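As an illustration, here is a minimal few-shot Chain of Thought prompt in the style popularized by the original CoT research; the worked example spells out its intermediate steps, cueing the model to do the same on the new question. The arithmetic problems are the widely cited examples from that line of work, used here purely for illustration.

```python
# A few-shot Chain of Thought prompt: the solved example includes its
# intermediate reasoning steps, not just the final answer.
cot_prompt = """\
Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls.
Each can has 3 tennis balls. How many tennis balls does he have now?
A: Roger started with 5 balls. 2 cans of 3 tennis balls each is
6 balls. 5 + 6 = 11. The answer is 11.

Q: The cafeteria had 23 apples. They used 20 to make lunch and
bought 6 more. How many apples do they have?
A:"""

# A standard prompt would give only "A: The answer is 11." in the
# example; on multi-step questions that typically yields lower accuracy.
```

Fed this prompt, a model tends to respond with its own chain ("They started with 23 apples. 23 - 20 = 3, and 3 + 6 = 9. The answer is 9.") rather than a bare guess.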
What is quite profound in the results is that LLMs not only improve their accuracy but also exhibit a form of structured reasoning that mimics human-like logical deduction. By explicitly guiding the model through intermediate steps, we create a scaffolded approach that allows it to process complex queries more effectively.
This structured reasoning is particularly useful in areas such as mathematical problem-solving, multi-hop reasoning, commonsense inference, and even legal or medical applications where a direct response might be insufficient. Instead of relying on intuition or heuristics alone, the model follows a step-by-step breakdown, reducing errors and increasing transparency in its responses.
Another significant advantage of Chain of Thought prompting is that it makes the model’s decision-making process interpretable. Instead of treating AI responses as a “black box,” we can now see how the model arrives at its conclusions. This is crucial in high-stakes applications where understanding the reasoning behind an answer is just as important as the answer itself.
Furthermore, research has shown that even smaller models can outperform larger ones when using CoT prompting, highlighting the power of structured reasoning over sheer scale. This suggests that refining prompting techniques could be just as impactful as increasing model size, leading to more efficient and capable AI systems.
Ultimately, Chain of Thought prompting represents a fundamental shift in how we interact with LLMs. It moves us away from expecting instant, unexplained answers toward a more transparent, logical, and pedagogical approach to AI reasoning—one that aligns with human cognitive processes and enhances our ability to trust and utilize AI effectively.
With the recent release of Meta's Llama 3 models, the future of AI seems to be one of abundant intelligence. Around a year ago there were fears of all the benefits of AI accruing to a small portion of the population, but this seems less likely now that the cost of running these applications is dropping dramatically.
What this means for society is still somewhat unknown. AI is a double-edged sword, as all new tools are: it could be used to leverage the decency in humanity, or it could magnify the evil capacity of our species.
The crossroads we face is like none other. Some comparisons could be made to the power of nuclear energy, but AI is far more permeable to all aspects of society, and therefore the outcomes are much more variable.
Our best hope would be that, with the additional intelligence available to us, humanity will add to our wells of wisdom. This has not necessarily been the case in recent history, with the explosion of information and data, but perhaps the missing ingredient has always been abundant intelligence.
It's becoming increasingly clear that our personal data holds much more value than many of us realized when we agreed to the terms of service that granted others access to this information. This data plays a crucial role in the development of artificial intelligence, benefiting large corporations financially.
As we move into 2024, the quality of internet data raises some concerns. The proliferation of bots and AI-generated content appears to dilute the overall quality of new information online.
This situation poses important questions: Is there an economic model that can support the creation of high-quality data for training future AI technologies? How does the quality of training data impact the performance of these models? Moreover, could high-quality data unlock new emergent capabilities in AI that we have yet to discover?
Bringing together a diverse group of individuals to collaborate on a project can be challenging, yet it's certainly achievable. The key lies in identifying the right incentives and establishing effective rules and governance structures.
In this context, blockchain technology offers promising solutions. It's well-suited for developing the protocols and contracts that can oversee data usage, ensuring that benefits are fairly distributed among all contributors.
I look forward to exploring this concept in greater detail in upcoming blog posts.
We live in an interesting time. A time when machines are beginning to express ideas, and simultaneously human beings are becoming more attached to "ideas" as a source of our identity. This seems to be a contradiction. If an idea is something that can be grasped, pulled from the ether of collective consciousness, and articulated by circuits and algorithms, then what does it say about the nature of our thoughts and beliefs? Are they truly ours, or are they just reflections of a larger, shared pool of human experience?
As technology advances, it blurs the lines between human originality and artificial intelligence's mimicry. We often pride ourselves on our unique ideas, believing them to be the essence of our individuality. However, the emerging reality suggests ideas, much like the words we use to express them, are communal. Born from a collective history of human thought, shaped and reshaped by culture, language, and shared experiences.
This realization invites us to rethink our relationship with our beliefs and opinions. Rather than tightly clinging to them as defining aspects of our identity, we might benefit from approaching them with a sense of fluidity and openness. Embracing the notion that our ideas are not entirely our own could foster greater empathy and understanding. It encourages us to listen more and assume less, to engage in dialogues not as combatants defending our intellectual territory but as explorers in a vast landscape of human thought.
In this new era, where ideas are as much a product of silicon as they are of neurons, it's perhaps more important than ever to recognize the shared nature of our thoughts. Detachment from the notion of ideas as personal property could lead us towards a more collaborative, tolerant, and innovative society. It challenges us to find our identity not in the rigidity of our beliefs, but in the richness of our shared human experience and our capacity to grow and change.
With the fast adoption of LLMs and other Generative AI technologies, we are seeing a massive increase in the amount of synthetic data being produced.
Synthetic data is data generated by an algorithm or program, as opposed to data captured from real-world events.
To put this in context, AI image generators have created over 15 billion images in just over a year, surpassing the first 150 years of photographically produced images.
This presents us with deep moral and philosophical questions, but it also provides us with immediate opportunities.
For example, with the use of LLMs, many small businesses can produce vast amounts of synthetic data for testing and other purposes. High-quality test data can reduce development time and help shrink the development life cycle.
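As a concrete sketch, here is how a small team might generate synthetic customer records for testing using the OpenAI Python client. The model name, prompt, and record schema are illustrative assumptions; any LLM that can return structured output would work just as well.

```python
import json
from openai import OpenAI  # official OpenAI Python client

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Illustrative schema: fictional customer records for testing, say, a CRM.
prompt = (
    "Generate 5 fictional customer records as a JSON array. Each record "
    "needs: name, email, signup_date (YYYY-MM-DD), and plan (one of "
    "'free', 'pro', 'enterprise'). Respond with the JSON array only."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; any capable model works
    messages=[{"role": "user", "content": prompt}],
)

records = json.loads(response.choices[0].message.content)
for record in records:
    print(record["name"], record["plan"])
```

Because the records are fabricated on demand, the test suite never touches real customer data, which sidesteps many privacy concerns in development environments.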
On the other hand, unethical businesses might use this data to inflate their customer base or other KPI metrics. As with any new tool or technology, we will have immense opportunity, for good and for bad.
The much bigger question of "What is real?" has reached a new intersection.
"What is real? How do you define 'real'?"
― Morpheus
It seems like just in the past year we have entered a new era of opportunity and possibilities. Tasks and ideas that were previously impossible are today within reach. The new paradigms of AI permeate all industries and can touch upon virtually any task that we can think of.
Why, then, are there so many doom-and-gloom scenarios, from the loss of jobs to the loss of human autonomy? It seems there is a natural inclination to imagine and fear the worst.
Here I will try my best to imagine some of the best, most realistic, and near outcomes that the new AI tools will bring to pass.
"We are what we pretend to be, so we must be careful about what we pretend to be."
― Kurt Vonnegut, Mother Night
Just in the last year, the thought of every student having a personalized education plan that meets their own needs and unique strengths has gone from a utopian dream to a near-term reality.
This will have a tremendously positive impact on income inequality and bring all sorts of societal benefits that are hard to imagine.
One of the earliest wow moments at the inflection point of generative AI in 2022 was the release of Stable Diffusion and other text-to-image tools.
This was followed quickly by calls from many artists to ban AI art, arguing that it was theft and otherwise not real art.
As time passes we see that the real strength of generative AI is in empowering artists with tools that amplify their imaginative capacity and ability to create art.
A painter is now a creator of worlds.
A major concern with LLMs and AI image generators is the ease with which they can produce plausible yet false information and narratives.
This could lead to an explosion of fake and false news stories.
However, an outcome of this could be an increase in skepticism and a real effort to find ways to improve our collective understanding.
The solution to bad information is good information. The challenge in recent years has been that the ease with which false and inflammatory content can be created and propagated has not been met with an equal ability to counter it with good information at the same speed and scale.
AI systems can quickly counteract bad actors. They can also help to expand and explore the conversation beyond soundbites and clickbait.
This all requires us to reverse the course of our ever-dwindling attention spans, but the tools are now at our disposal for a proper defense of our information ecosystems.