| Breyden Taylor

Deep Dive into AI: From Tokenization to Perception

Let me reassure you: I am responsible for changing the tag below, and I'm aware of what "practical" means.

It turns out that today, "theoretical_ai" is an inaccurate descriptor. I began pursuing "perspective" in early 2023.

"top_tag": "practical_ai"

And while you're asking, I'll further say:

"it once was...

... a theory, that is."

-breyden 08-2024


Introduction: Unpacking AI with Assistable

This group is a diverse mix of entrepreneurs, all leveraging AI in different ways. Some are seasoned veterans in sales and marketing, diving into AI for the first time. Others are AI experts, mastering the business side of things. And for some, it's all just plain fun.

But regardless of where you stand on the spectrum, we're all standing on the cutting edge of AI together.

For me, the idea of being on the cutting edge of anything felt like a distant dream—until AI started democratizing everything. AI doesn't just open doors; it tears down roadblocks. It's helping turn lofty dreams into real, tangible goals.

Why AI Matters

AI has this unique ability to shrink the distance between where you are now and where you want to be. It lets you see the big picture, bypass roadblocks, and zero in on your end goals with courage. It's giving synergy new meaning—real synergy, not just a LinkedIn buzzword. We've redefined "bespoke" solutions, and now, we're redefining collaboration.

AI isn't just helping us imagine new possibilities—it's helping us create them. And as we step into 2024, we're finding that even the meaning of "bespoke" has changed. It's deeper, heavier, more nuanced. We can now calculate how something like "bespoke" evolves over time, thanks to machine learning.

Why You Should Care About the Technical Side

I get it. Not everyone is interested in the technical underpinnings of AI. For many of you, the goal is simple: tangible results—better customer interactions, more efficiency, and streamlined operations. And that's totally valid. AI is a tool that can deliver those results.

But there's an advantage to going a little deeper. A solid understanding of how AI works can offer three key benefits:

  • Troubleshooting: When something goes wrong, understanding the principles behind AI can make it easier to diagnose and fix the issue, saving time and resources.

  • Explaining Value: Whether you're pitching to clients or investors, being able to articulate the value of AI-driven solutions with technical precision will make your case more compelling.

  • Credibility: In a world where everyone is jumping on the AI bandwagon, demonstrating a deep, well-informed grasp of the technology can set you apart.

Now, let's dive into the technical details behind some of the AI systems you're interacting with every day. I pulled a transcript from an assistable.ai Town Hall meeting and we'll be breaking it down to show some foundational concepts in artificial intelligence.

Tokenization: Breaking Down Language for AI

Tokenization is the process of converting text into smaller units called tokens, which can be individual words, subwords, or even characters. This is crucial because natural language often carries meaning at multiple levels—breaking text down into tokens allows the AI model to process it more effectively.

However, tokenization is not just about chopping up sentences. It's about striking the right balance. Breaking text into words may miss subtle nuances, while breaking it into individual letters creates too much noise. Tokens offer a "happy medium" where enough context is preserved for the AI to process high-dimensional semantic meaning.

Why It Matters:

  • Preprocessing Power: Tokenization acts as a preprocessing step, stripping away unnecessary noise—like those pesky prepositions and articles—so the AI can focus on what's meaningful. Think of it like filtering out the static on a radio station until you get a clear signal.

For example, after applying tokenization and filtering out meaningless words with a custom stopwords list, we were left with what I like to call "tasty_tokens": a refined dataset of highly relevant, semantically rich content that captures the meaning of the Assistable town hall transcript.

  • Contextual Understanding: Tokens allow AI to better grasp the layers of meaning within human language, replicating the intricacies of communication.

  • Application: When working with your own datasets, use tokenization to break down complex text. Try different levels of granularity (words, subwords, etc.) and experiment with custom stopwords lists to ensure you're filtering out noise effectively.
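As a minimal sketch of the idea, here is word-level tokenization with a custom stopwords filter. The tokenizer, the stopwords list, and the sample sentence are all illustrative, not the actual pipeline used on the town hall transcript:

```python
import re

# A small custom stopwords list -- in practice you would extend this
# or start from a library's defaults (e.g. NLTK's stopwords corpus).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it"}

def tokenize(text):
    """Lowercase word-level tokenization via a simple regex."""
    return re.findall(r"[a-z']+", text.lower())

def tasty_tokens(text, stopwords=STOPWORDS):
    """Keep only the semantically rich tokens after filtering stopwords."""
    return [tok for tok in tokenize(text) if tok not in stopwords]

print(tasty_tokens("The agent is a tool to streamline support in sales"))
# -> ['agent', 'tool', 'streamline', 'support', 'sales']
```

Swapping the regex for a subword tokenizer (as GPT-style models use) changes the granularity without changing the overall shape of this preprocessing step.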

Sentiment Analysis: Decoding Human Emotion in Text

Sentiment Analysis is another cornerstone of Natural Language Processing (NLP). This process involves algorithms that determine whether a piece of text carries a positive, negative, or neutral sentiment. Originally, sentiment models were basic classifiers—labeling text as good or bad. Modern algorithms, however, can assess degrees of sentiment and express these as numerical values.

Evolution of Sentiment Analysis

Check out the sentiment analysis of the Assistable town hall transcript 📊 👀

Why It Matters:

  • Classifiers to Calculators: Early sentiment models were rigid, capable of labeling but not much else. Today's algorithms are far more nuanced, quantifying sentiment as a gradient rather than a binary.

  • Beyond Numbers: Sentiment analysis blurred the lines between different data types (Boolean, text, and numbers), allowing NLP engineers to perform mathematical operations on non-numerical data. This has been a game changer for systems that rely on relative dependencies, where the state of one element is defined by its relationships with others.

Why It's Powerful:

  • Dynamic Relativity: Much like dark and light are co-dependent concepts, sentiment operates on a spectrum. Sentiment analysis assigns relative values to text, creating a dynamic relational system where meaning is contextual, fluid, and driven by surrounding information.

  • Practical Application: Whether it's analyzing customer feedback, gauging social media reactions, or powering recommendation systems, sentiment analysis helps businesses and developers get a clearer understanding of how people feel about a product or service.
[Chart: Assistable Town Hall sentiment by line number]

Though sentiment analysis has come a long way, it's not perfect.

Human sentiment is subjective, and we lack a universal standard for interpreting it.

So while AI can approximate sentiment, there's still a gap between its capabilities and human emotional nuance. But as we continue refining AI models, I believe we'll eventually close that gap, mapping out the missing pieces between human and machine interpretation of emotion.

Actionable Takeaway

Use sentiment analysis to monitor customer feedback, product reviews, or employee engagement. Track changes in sentiment over time to identify trends and take corrective action when negative sentiment spikes. In meeting transcripts, gauge buy-in and get user journey insights to assist with feature roadmapping and preemptive support.

  • Tool Suggestion: Platforms like TextBlob or VADER can easily perform sentiment analysis. For more complex sentiment tracking, you can integrate these into a dashboard for ongoing monitoring.
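The lexicon-based approach behind tools like VADER can be sketched in a few lines. The tiny lexicon and sample sentence below are invented for illustration; real tools ship thousands of human-rated entries and also handle negation, intensifiers, and punctuation:

```python
# Illustrative mini-lexicon mapping words to valence scores.
# Real lexicons (e.g. VADER's) are much larger and human-curated.
LEXICON = {"love": 2.0, "great": 1.5, "good": 1.0,
           "bad": -1.0, "slow": -1.0, "terrible": -2.0}

def sentiment_score(text):
    """Average the valence of known words; 0.0 means neutral/unknown."""
    words = text.lower().split()
    hits = [LEXICON[w] for w in words if w in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0

print(sentiment_score("the demo was great but onboarding felt slow"))
# -> 0.25  (great: +1.5, slow: -1.0, averaged)
```

Running this over each line of a transcript and plotting the scores in order is exactly the "sentiment over line number" view shown above: a gradient, not a binary label.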


Thematic Categorization: Extracting Meaning from Chaos

In any large dataset, identifying themes or topics is crucial for organization and insight. This is where thematic categorization comes into play. By categorizing text into specific themes, we can transform an otherwise unstructured dataset into something that can be easily analyzed and understood.

Why It Matters:

  • Organizing Unstructured Data: Historically, data had to fit into neat structures (think: library catalogs or the Dewey Decimal system). But with thematic categorization, AI can now handle unstructured data, making it accessible without forcing it into rigid formats.

  • Efficiency: Combining structured and unstructured data formats is not only possible but powerful. This flexibility lets AI extract meaning regardless of how the data is organized. Instead of manually sorting through endless documents or conversations, thematic categorization allows the AI to automatically group and prioritize information based on relevant themes.

For example, in analyzing our transcript, themes like AI, Support, and Sales emerged as dominant. This kind of categorization provided insights into the main focus of the discussions, helping us understand what topics garnered the most attention. Thematic categorization can also be a precursor to more advanced techniques like semantic clustering, which groups text by meaning rather than just keywords.


Actionable Takeaway:

  • Application: Categorize your business communications (e.g., emails, customer service chats, or product feedback) to uncover the most talked-about themes. This can help prioritize areas of improvement or marketing strategies.

  • Tool Suggestion: Use topic modeling algorithms like Latent Dirichlet Allocation (LDA) in Python to categorize themes. Tools like gensim can automate this process and help you discover hidden themes in your data.
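Before reaching for full topic modeling, a keyword-matching pass can illustrate the idea. The theme keywords and sample lines below are invented for the example; an algorithm like LDA discovers the groupings from the corpus itself rather than being handed them:

```python
from collections import Counter

# Hand-picked keywords per theme -- purely illustrative. Topic models
# like LDA learn these groupings from the data instead.
THEMES = {
    "AI": {"model", "token", "prompt", "agent"},
    "Support": {"ticket", "help", "issue", "onboarding"},
    "Sales": {"pipeline", "deal", "pricing", "close"},
}

def theme_counts(lines):
    """Count how many lines mention each theme's keywords."""
    counts = Counter()
    for line in lines:
        words = set(line.lower().split())
        for theme, keywords in THEMES.items():
            if words & keywords:
                counts[theme] += 1
    return counts

transcript = [
    "the agent uses a new model",
    "pricing questions stalled the deal",
    "an onboarding ticket came in",
]
print(theme_counts(transcript))
```

The output is the same kind of summary described above (which themes dominate a discussion), just computed by rule instead of learned by a model.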

Contextual Analysis: Understanding Words in Their Surroundings

While tokenization breaks down text into smaller units, contextual analysis takes a step back to examine how those units fit together. The meaning of a word often depends heavily on its context, and contextual analysis helps AI models generate more accurate, relevant responses by looking at the position and relationships of words within a text.

Why It Matters:

  • Accuracy: Contextual analysis prevents AI from taking words at face value. For instance, consider the phrase "I'm feeling blue." Without context, an AI might interpret "blue" as a color. But contextual analysis reveals that in this case, "blue" is a metaphor for sadness.

  • Dynamic Understanding: Context doesn't just affect individual words; it changes the meaning of entire sentences. Contextual analysis is one of the reasons that models like GPT-4 can generate more human-like responses—they don't just understand the words; they understand how those words interact.

By examining the relationships between tokens, the AI can disambiguate meaning and adapt its responses based on the surrounding text. It's what makes chatbots more conversational, allowing them to "remember" the context of a conversation and maintain relevance.
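The "feeling blue" example can be made concrete with a crude context check. The cue words here are hand-picked for illustration; transformer models like BERT learn these associations from data rather than from rules:

```python
# Hand-picked context cues -- illustrative only. Contextual models
# learn word-sense associations from surrounding text automatically.
EMOTION_CUES = {"feeling", "feel", "felt", "mood", "sad"}
COLOR_CUES = {"paint", "sky", "shirt", "color", "wall"}

def sense_of_blue(sentence):
    """Pick a sense for 'blue' from the words around it."""
    words = set(sentence.lower().replace(".", "").split())
    if words & EMOTION_CUES:
        return "emotion"
    if words & COLOR_CUES:
        return "color"
    return "unknown"

print(sense_of_blue("I'm feeling blue"))     # -> emotion
print(sense_of_blue("Paint the wall blue"))  # -> color
```

The rule-based version breaks as soon as the phrasing changes; the point of contextual embeddings is that the disambiguation generalizes instead of being enumerated by hand.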

Actionable Takeaway:

  • Application: When building AI-driven applications like chatbots or recommendation engines, leverage contextual analysis to improve the quality of responses. Ensure your model understands the context of conversations to provide accurate, human-like answers.

  • Tool Suggestion: Utilize transformer-based models like BERT or GPT for applications where contextual understanding is key. Pre-trained models from Hugging Face can be fine-tuned to suit your specific needs.

Looking Forward with Eyes on "Perception"

As we move beyond the technicalities of tokenization, sentiment analysis, thematic categorization, and contextual analysis, there's a larger concept looming on the horizon that could redefine how we interact with AI: Perception.

In human terms, perception isn't just about seeing or hearing—it's about interpreting and making sense of the world around us. It's the ability to synthesize multiple inputs, process them in context, and derive meaning that goes beyond raw data. As AI systems evolve, building models capable of not just understanding language, but perceiving it in a more dynamic, human-like way, will be key to unlocking their true potential.

The Role of Perception in AI:

  • Contextual Depth: Perception goes beyond surface-level understanding and deepens AI's ability to grasp nuance and intent. Imagine an AI that doesn't just analyze customer feedback for sentiment, but truly perceives the emotional subtext behind it, offering more empathetic and precise responses.

  • Adaptive Learning: AI systems with perceptual capabilities can better adapt to changing environments. Whether it's adjusting to new data inputs, or reacting to shifts in a conversation, AI perception allows for real-time adaptability, improving both user interaction and decision-making processes.

  • Collaboration & Interaction: As we saw with contextual analysis, AI today can understand and respond to isolated queries, but perception opens the door for more sophisticated collaboration. AI with perception could actively engage with users, evolving with their inputs and assisting in a way that feels more like true partnership than just command-response.

Perception as a Static Variable:

A key part of the journey forward is understanding how to treat perception as a static global variable. Much like time in machine learning models can be treated as static, perception allows AI to anchor its understanding in a consistent framework—while still being flexible enough to shift dynamically based on real-time data.

By defining perception as a constant, we give AI a stable reference point. But we can also allow the system to stretch the boundaries of that perception based on interaction with users, other models, or data environments. It's akin to how humans experience a shared reality, yet have individual interpretations based on their unique perspectives.
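One way to picture "a stable anchor that can stretch dynamically" in code. This is a conceptual sketch of the idea as described above, not an implementation from any framework; the names and numbers are invented:

```python
# Conceptual sketch: a fixed baseline ("static" perception) that every
# reading is interpreted against, plus a bounded offset that drifts
# with incoming observations -- shared reality, individual perspective.
BASELINE = 0.0       # the shared, anchored frame of reference
MAX_STRETCH = 1.0    # how far interpretation may drift from the anchor

class Perceiver:
    def __init__(self):
        self.offset = 0.0  # this agent's individual drift

    def observe(self, signal, rate=0.2):
        """Nudge the offset toward the signal, clamped within bounds."""
        self.offset += rate * (signal - self.offset)
        self.offset = max(-MAX_STRETCH, min(MAX_STRETCH, self.offset))

    def perceive(self, signal):
        """Interpret a raw signal relative to baseline plus drift."""
        return signal - (BASELINE + self.offset)

p = Perceiver()
p.observe(0.5)
print(round(p.perceive(0.5), 2))  # -> 0.4
```

Two Perceiver instances fed different observations will interpret the same signal differently while still sharing the same BASELINE, which is the "shared reality, unique perspectives" analogy in miniature.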

What's Next for AI and Perception?

Looking ahead, the most exciting developments in AI will likely involve perception as a dynamic yet anchored concept. Here's what to expect:

  • Perception-Driven Agents: Imagine AI models designed to not only respond to queries but to develop an understanding of their environment, interacting with it in a more context-aware, perceptual manner. This could revolutionize customer service, personal assistants, and even creative fields like content generation.

  • Perceptual Awareness in Complex Systems: Beyond individual interactions, perception will be vital in managing multi-layered, complex systems where variables change constantly. Think of AI systems in autonomous driving, healthcare diagnostics, or financial modeling, where a deeper level of perceptual awareness could lead to better decisions, faster reactions, and more robust predictions.

  • The Human-AI Hybrid: Perception will bring us closer to a future where humans and AI work together seamlessly, with the AI understanding not just the literal meaning of what we say, but the nuances, intentions, and emotions behind it. This collaborative perception will make AI more intuitive, more reliable, and more aligned with human needs.

Conclusion: Embracing the Perception Horizon with Sedna and Homeskillet

Perception is at the forefront of AI's next evolution. The Sedna/Homeskillet framework exemplifies this with a dual-agent system, based on McGilchrist's brain lateralization theory, in which perception acts as a dynamic variable: Sedna manages microservices and simulations while Homeskillet oversees task execution. By mirroring human cognitive lateralization, the framework creates a real-time, adaptable system capable of parallel processing. Moving beyond traditional abstractions, it enables AI to interact more holistically with its environment, anticipating and responding to change in a human-like manner.

As we continue to develop and refine these perception-driven AI systems, we're not just creating more efficient tools—we're building collaborators that can truly understand and engage with the complexities of human thought and communication. The future of AI isn't just about better algorithms or faster processors—it's about teaching AI to see the world in the same complex, multifaceted ways we do. And with frameworks like Sedna/Homeskillet, that future is already taking shape.

Read more about this in the Sedna/Homeskillet framework, a model already shaping the future of AI interaction and decision-making.