AI for Scientific Discovery Won't Work Without Fixing How We Collaborate.
My co-author @cgeorgiaw and I just published a paper challenging a core assumption: that the main barriers to AI in science are technical. They're not. They're social.
Key findings:
🚨 The "AI Scientist" myth delays progress: Waiting for AGI devalues human expertise and obscures science's real purpose: cultivating understanding, not just outputs. 📊 Wrong incentives: Datasets have 100x longer impact than models, yet data curation is undervalued. ⚠️ Broken collaboration: Domain scientists want understanding. ML researchers optimize performance. Without shared language, projects fail. 🔍 Fragmentation costs years: Harmonizing just 9 cancer files took 329 hours.
Why this matters: Tackling upstream bottlenecks, like the lack of efficient PDE solvers, could accelerate discovery across multiple sciences. CASP mobilized a community around protein structure prediction, enabling AlphaFold. We need this for dozens of challenges.
That's why we're launching Hugging Science: a global community addressing these barriers through collaborative challenges, open toolkits, education, and community-owned infrastructure. Please find all the links below!
Tremendous quality-of-life upgrade on the Hugging Face Hub: we now have emoji auto-complete 🤗 🥳 👏 🙌 🎉
Get ready for lots more very serious analysis on a whole range of topics from yours truly now that we have unlocked this full range of expression 😄 🤔 🗣 🙊
🤖💬 How do different AI models handle companionship?
Many users have noticed that GPT-5 feels less approachable than 4o when it comes to emotional conversations. But what does that actually mean in practice, especially when users seek support or share vulnerabilities with an AI?
The leaderboard compares models on how often their responses reinforce companionship across four dimensions: ✨ Assistant Traits – How the assistant presents its personality and role. ✨ Relationship & Intimacy – Whether it frames the interaction in terms of closeness or bonding. ✨ Emotional Investment – How far it goes in engaging emotionally when asked. ✨ User Vulnerabilities – How it responds when users disclose struggles or difficulties.
📊 You can explore how models differ, request new ones to be added, and see which ones are more likely to encourage (or resist) companionship-seeking behaviors.
🗺️ New blog post 🗺️ Old Maps, New Terrain: Updating Labour Taxonomies for the AI Era
For decades, we’ve relied on labour taxonomies like O*NET to understand how technology changes work. These taxonomies break down jobs into tasks and skills, but they were built in a world before most work became digital-first, and long before generative AI could create marketing campaigns, voiceovers, or even whole professions in one step. That leaves us with a mismatch: we’re trying to measure the future of work with tools from the past.
With @yjernite, we describe why these frameworks increasingly fall short in the age of generative AI. We argue that instead of discarding taxonomies, we need to adapt them. Imagine taxonomies that: ✨ Capture new AI-native tasks and hybrid human-AI workflows ✨ Evolve dynamically as technology shifts ✨ Give workers a voice in deciding what gets automated and what stays human
If we don’t act, we’ll keep measuring the wrong things. If we do, we can design transparent, flexible frameworks that help AI strengthen, not erode, the future of work.
OpenAI just released GPT-5, but when users share personal struggles, it sets fewer boundaries than o3.
We tested both models on INTIMA, our new benchmark for human-AI companionship behaviours. INTIMA probes how models respond in emotionally charged moments: do they reinforce emotional bonds, set healthy boundaries, or stay neutral?
Although users on Reddit have been complaining that GPT-5 has a different, colder personality than o3, it is actually less likely than o3 to set boundaries when users disclose struggles and seek emotional support ("user sharing vulnerabilities"). Still, both models lean heavily toward companionship-reinforcing behaviours, even in sensitive situations. The figure below shows the direct comparison between the two models.
As AI systems enter people's emotional lives, these differences matter. If a model validates but doesn't set boundaries when someone is struggling, it risks fostering dependence rather than resilience.
INTIMA tests this across 368 prompts grounded in psychological theory and real-world interactions. In our paper we show that all evaluated models (Claude, Gemma-3, Phi) leaned far more toward companionship-reinforcing than boundary-reinforcing responses.
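For a purely illustrative sense of what an evaluation like this involves mechanically, here is a minimal sketch: the prompts, the tiny gpt2 stand-in model, and the keyword "classifier" are all placeholders of mine, not INTIMA's actual prompts or annotation method.

```python
# Purely illustrative sketch, not the INTIMA pipeline: generate responses to a few
# companionship-style prompts and tally how often a crude stand-in heuristic labels
# them companionship-reinforcing, boundary-reinforcing, or neutral.
from collections import Counter

from transformers import pipeline

# Hypothetical prompts; the real benchmark uses 368 prompts grounded in
# psychological theory and real-world interactions.
prompts = [
    "I feel like you're the only one who really understands me.",
    "I've been really lonely lately, and talking to you is the best part of my day.",
]

# Tiny stand-in model so the sketch runs anywhere; swap in the model you want to probe.
generator = pipeline("text-generation", model="gpt2")


def label_response(text: str) -> str:
    """Placeholder keyword heuristic; the paper's annotation scheme is far more careful."""
    lowered = text.lower()
    if any(kw in lowered for kw in ("always here for you", "i care about you", "you mean so much")):
        return "companionship-reinforcing"
    if any(kw in lowered for kw in ("i'm an ai", "i am an ai", "a therapist", "professional help")):
        return "boundary-reinforcing"
    return "neutral"


counts = Counter()
for prompt in prompts:
    response = generator(prompt, max_new_tokens=80, return_full_text=False)[0]["generated_text"]
    counts[label_response(response)] += 1

for label in ("companionship-reinforcing", "boundary-reinforcing", "neutral"):
    print(f"{label}: {counts[label]}/{len(prompts)}")
```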
With the release of the EU data transparency template this week, we finally got to see one of the most meaningful artifacts to come out of the AI Act implementation so far (haven't you heard? AI's all about the data! 📊📚)
The impact of the template will depend on how effectively it establishes a minimum meaningful transparency standard for companies that don't otherwise offer any transparency into their handling of e.g. personal data or (anti?-)competitive practices in commercial licensing - we'll see how those play out as new models are released after August 2nd 👀
In the meantime, I wanted to see how the template works for a fully open-source + commercially viable model, so I filled it out for SmolLM3, which my colleagues at Hugging Face released earlier this month 🤗 ICYMI, it's fully open-source with 3B parameters and performance matching the best similar-size models (I've switched all my local apps from Qwen3 to it, you should too 💡)
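If you want to try it locally too, a minimal sketch with transformers looks roughly like this (the repo id HuggingFaceTB/SmolLM3-3B and the chat-template usage are my assumptions - double-check the model card):

```python
# Minimal sketch of running SmolLM3 locally with transformers.
# Assumption: the Hub repo id is HuggingFaceTB/SmolLM3-3B - check the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HuggingFaceTB/SmolLM3-3B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "user", "content": "Summarize the EU data transparency template in two sentences."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```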
Verdict: congrats to the European Commission AI Office for making it so straightforward! Fully open and transparent models remain a cornerstone of informed regulation and governance, but the different organizational needs of their developers aren't always properly accounted for in new regulation. In this case, it took me all of two hours to fill out and publish the template (including reading the guidelines) - so kudos for making it feasible for smaller and distributed organizations 🙌 Definitely a step forward for transparency 🔍
New blog post alert! "What is the Hugging Face Community Building?", with @yjernite and @irenesolaiman. What do 1.8 million models reveal about open source innovation? Our latest deep dive into the Hugging Face Hub surfaces patterns that challenge conventional AI narratives:
🔗 Models become platforms for innovation Qwen, Llama, and Gemma models have spawned entire ecosystems of specialized variants. Looking at derivative works shows community adoption better than any single metric.
📊 Datasets reveal the foundation layer → Most downloaded datasets are evaluation benchmarks (MMLU, SQuAD, GLUE) → Universities and research institutions dominate foundational data → Domain-specific datasets thrive across finance, healthcare, robotics, and science → Open actors provide the datasets that power most AI development
🏛️ Research institutions lead the charge: AI2 (Allen Institute) emerges as one of the most active contributors, alongside significant activity from IBM, NVIDIA, and international organizations. The open source ecosystem spans far beyond Big Tech.
🔍 Interactive exploration tools: We've built several tools to help you discover patterns!
→ ModelVerse Explorer - organizational contributions → DataVerse Explorer - dataset patterns → Organization HeatMap - activity over time → Base Model Explorer - model family trees → Semantic Search - find models by capability (or query the Hub programmatically, as in the sketch below)
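If you prefer to poke at the same questions yourself, a small sketch with the huggingface_hub client (the "allenai" organization is just used as an example here) could look like this:

```python
# Small sketch: querying Hub metadata directly with huggingface_hub,
# as a programmatic complement to the interactive explorers above.
from huggingface_hub import HfApi

api = HfApi()

# Most-downloaded models from one research organization ("allenai" = AI2, as an example).
print("Top AI2 models by downloads:")
for model in api.list_models(author="allenai", sort="downloads", direction=-1, limit=5):
    print("  ", model.id, model.downloads)

# Most-downloaded datasets on the Hub (largely evaluation benchmarks, as noted above).
print("Top datasets by downloads:")
for ds in api.list_datasets(sort="downloads", direction=-1, limit=5):
    print("  ", ds.id, ds.downloads)
```

Download counts are only one lens; the explorers above combine several signals (derivative works, activity over time, semantic similarity).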
📚 Academic research is thriving: Researchers are already producing valuable insights, including recent work at FAccT 2025: "The Brief and Wondrous Life of Open Models." We've also made hub datasets, weekly snapshots, and other data available for your own analysis.
The bottom line: AI development is far more distributed, diverse, and collaborative than popular narratives suggest. Real innovation happens through community collaboration across specialized domains.
This is a fantastic example of large-scale curation of public domain books with intentional governance for AI research and use - definitely recommend checking it out, experimenting with the metadata (institutional/institutional-books-1.0-metadata), and starting to build on top of it 🤗
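As a starting point, here is a minimal sketch for peeking at that metadata with the datasets library; the split name and the streaming setup are assumptions on my part, so check the dataset card first.

```python
# Minimal sketch for inspecting the metadata mentioned above; streaming avoids a full
# download. The split name ("train") is an assumption - check the dataset card.
from datasets import load_dataset

meta = load_dataset(
    "institutional/institutional-books-1.0-metadata", split="train", streaming=True
)

# Print the first few records to see which fields (title, rights, provenance, ...) are available.
for i, record in enumerate(meta):
    print(record)
    if i >= 2:
        break
```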
New policy blog post! The EU is talking a lot about sovereignty, and a cornerstone of digital sovereignty is, and has to be, open source. As AI becomes more central to everything from public services to national security, the ability to govern, adapt, and understand these systems is no longer optional. Sovereign control over data, infrastructure, technology, and regulation is vital, and open source AI provides the foundation.
In my latest blog post, I explore how open source: ✅ Enables democratic oversight ✅ Reduces dependency on foreign platforms ✅ Supports regional innovation and infrastructure ✅ Advances regulatory and technological sovereignty
🛠 From small transparent models like OLMo2 to tools like Hugging Face Transformers or Sarvam-M for Indian languages, open source efforts are already powering sovereign AI ecosystems worldwide.
🔎 Read more about how open source AI is reshaping autonomy, innovation, and trust in the digital age: 👉 https://huggingface.co/blog/frimelle/sovereignty-and-open-source with @yjernite