We release: 67,000+ trajectories from 3,800 resolved issues across 1,800+ Python repos. That is about 3x more successful trajectories and 1.5x more repos than our previous dataset. Trajectories are long: 64 turns on average, up to 100 turns and 131k tokens of context.
> RFT on this data, SWE-bench Verified: Qwen3-30B-Instruct: 25.7% → 50.3% Pass@1. Qwen3-235B-Instruct: 46.2% → 61.7% Pass@1. Also strong gains on SWE-rebench September.
> We also ran extensive evals: OpenHands with both 100-turn and 500-turn limits, comparing models under each limit, on SWE-bench Verified and several months of SWE-rebench.
> We also check the tests written by the models: how often the tests are correct, and how often the final patch passes its own tests. This yields a pool of tests for verifiers and auto-graders.
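A minimal sketch of how such a test pool might be scored. The record fields (`passes_on_gold`, `passes_on_base`, `patch_passes_own_test`) and the correctness criterion (a model-written test counts as correct if it passes on the gold patch but fails on the unpatched repo) are assumptions for illustration, not the release's actual schema:

```python
def score_tests(records):
    """Score model-written tests from per-trajectory boolean outcomes.

    records: list of dicts, one per trajectory, with hypothetical keys:
      passes_on_gold       - test passes when the gold patch is applied
      passes_on_base       - test passes on the unpatched repo
      patch_passes_own_test - the model's final patch passes its own test
    """
    # A "correct" test discriminates: passes on gold, fails on base.
    correct = [r for r in records
               if r["passes_on_gold"] and not r["passes_on_base"]]
    self_consistent = [r for r in records if r["patch_passes_own_test"]]
    n = len(records)
    return {
        "test_correct_rate": len(correct) / n,
        "self_pass_rate": len(self_consistent) / n,
    }

records = [
    {"passes_on_gold": True,  "passes_on_base": False, "patch_passes_own_test": True},
    {"passes_on_gold": True,  "passes_on_base": True,  "patch_passes_own_test": True},
    {"passes_on_gold": False, "passes_on_base": False, "patch_passes_own_test": False},
    {"passes_on_gold": True,  "passes_on_base": False, "patch_passes_own_test": True},
]
print(score_tests(records))  # {'test_correct_rate': 0.5, 'self_pass_rate': 0.75}
```

Tests that clear the discrimination bar can then be reused as cheap verifiers or auto-grader signals for new patches on the same issue.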
The geofractal getting-started guide is now available, along with bulk ablation for fusion, simple towers, oscillator capacity, and substructure systemic associative capacity. Many formulas were tested: 92 tests for collectives, oscillation bulk experiments, and more. All of them either coalesce into the correct behavior or fail in directly visible ways, which means the system is robust enough to declare some tools functionally valid but not yet scalable.
This is likely one of its final growing phases before full production capacity is ramped up. The architecture is not for novices; it is meant for experts to get ideas, borrow code, use library capacity, or simply tell AI what to do. Most files in current production carry good descriptions for AI integration.
The wide router compiler organizes similar towers into stacked, staged combinations before compiling with torch.compile. This is experimental, but it has shown speedups across multiple wide-model structures and will serve its purpose going forward.
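A minimal sketch of the "stack similar towers" idea, under the assumption that "similar" means identically shaped `nn.Linear` towers: their weights are stacked so all towers run as one batched matmul, giving the compiler a single large kernel instead of N small ones. The class name and grouping are illustrative, not the actual wide router compiler:

```python
import torch
import torch.nn as nn

class StackedTowers(nn.Module):
    """Fuse N same-shape Linear towers into one batched matmul."""

    def __init__(self, towers):
        super().__init__()
        # Stack per-tower parameters: weight (n, out, in), bias (n, out).
        self.weight = nn.Parameter(torch.stack([t.weight for t in towers]))
        self.bias = nn.Parameter(torch.stack([t.bias for t in towers]))

    def forward(self, x):
        # x: (n_towers, batch, in) -> (n_towers, batch, out) in one bmm.
        return torch.baddbmm(self.bias.unsqueeze(1), x,
                             self.weight.transpose(1, 2))

towers = [nn.Linear(8, 4) for _ in range(3)]
fused = StackedTowers(towers)
# The fused module can then be handed to torch.compile as a single unit:
# fused = torch.compile(fused)

x = torch.randn(3, 2, 8)
out = fused(x)
# Matches running each tower separately on its own input slice.
ref = torch.stack([towers[i](x[i]) for i in range(3)])
assert torch.allclose(out, ref, atol=1e-6)
```

The payoff is that torch.compile traces one batched op rather than many small per-tower ops, which is where the observed speedups with wide models would come from.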
From AI demos to production systems: what breaks when agents become autonomous?
A recurring lesson from production AI deployments is that most failures are system failures, not model failures.
As organizations move beyond pilots, challenges increasingly shift toward:
• Agent identity and permissioning
• Trust boundaries between agents and human operators
• Governance and auditability for autonomous actions
• Security treated as a first-class architectural constraint
This recent Fortune article highlights how enterprises are navigating that transition, including work with AWS's AI Innovation Lab.
Open question for the community: What architectural patterns or tooling are proving effective for managing identity, permissions, and safety in autonomous or semi-autonomous agent systems in production?