The era of data/AI platform collaboration

Like everyone, I am following the gen AI race, curiously watching the leaderboards and benchmarks used to evaluate generative AI models. This is methodical, which is good. But will it lead us toward the next level of AI within industries and companies?

First, for those interested, this article provides a comprehensive overview of the recent success of Meta’s Llama 3 model and how it compares to others. Conveniently, it also defines the benchmarks used, whether a college entrance exam, a logic puzzle, or a biology, physics, or chemistry exam. The definitions come after some impressive figures on the number of GPUs used and the model parameters.

Their conclusion is very clear. Going by their reading of the benchmarks, the 8B model (lighter and easier for ordinary people to use) is a grade F student, while the 400B model (a behemoth requiring extensive RAM) is a fairly good student with a grade B. This leads them to question whether and how such LLMs would be used by companies.

I agree with their questioning, at least for now. But in the future? This article from the Harvard Business Review proposes an interesting pathway: “data collaboration” between companies to build “better AI”.

They start, rightly, with a fair diagnosis: “not enough data”, “unstructured or of poor quality”, “representation of diverse perspectives”. After that I find it a bit confusing: it talks about platforms, sharing high-quality data, or sharing only the algorithm, all of which are supposed to solve the data privacy problem. There is no mention of Hugging Face’s platform and leaderboard.

Nevertheless, the conclusion is nice, and this is where I want to go. There is a balance to strike between sharing high-quality data and the importance of customization.

“By embracing data collaborations, business leaders can safely access high-quality data, avoid legal issues, gain a diverse, pluralistic, and therefore more expansive view of the world, unlocking the full potential of fine-tuned models.”

“Customization is key to aligning these tools with an organization’s unique environment and requirements.”

“Organizations in the same industry can collaborate to tackle challenges from which the whole industry suffers; by pooling resources and knowledge, companies can collectively enhance AI models, leading to innovations and efficiencies that might not be achievable independently.”

OK, now for my open-ended questions on this. Undoubtedly, AI will continue to see significant development, and open-source collaboration will play a crucial role. But:

– Can every data issue be solved with external data? Sometimes, yes. Always? Probably not. In many situations, quality data representing diverse perspectives simply does not exist.

– What is the impact on knowledge and data governance within companies? How feasible is it to define the limits of the data and knowledge transferred externally?

– How will companies’ value change? Which business models will evolve? It is probably better to be a company with physical assets. The (successful) platforms that enhance interactions, yes. But what about the others in the digital economy?

I hope global collaboration will work and lead us toward a landscape where human agility and creativity, on top of emotional intelligence, are what employees are valued for, making them companies’ greatest assets. To quote Yann LeCun:

AI will bring a new era of enlightenment, a renaissance to humanity.

I hope so! I look forward to reading your thoughts on this. On my side, I have more questions than conclusions.
