Feynman Innovations
ajibawa-2023
176 followers · 23 following
AjinkyaBawase
AI & ML interests
LLM, RL, DL, ML, AGI. Developing LLMs (preferably fully fine-tuned) for various use cases.
Recent Activity
reacted to DmitryRyumin's post 2 days ago
New Research Alert - ICCV 2025 (Oral)!
Title: Variance-based Pruning for Accelerating and Compressing Trained Networks
Description: The one-shot pruning method efficiently compresses networks, reducing computation and memory usage while retaining almost full performance and requiring minimal fine-tuning.
Authors: Uranik Berisha, Jens Mehnert, and Alexandru Paul Condurache
Conference: ICCV, 19–23 Oct 2025 | Honolulu, Hawai'i, USA
Paper: https://huggingface.co/papers/2507.12988
ICCV-2023-25-Papers: https://github.com/DmitryRyumin/ICCV-2023-25-Papers
Added to the Efficient Learning section: https://github.com/DmitryRyumin/ICCV-2023-25-Papers/blob/main/sections/2025/main/efficient-learning.md
More papers: more cutting-edge research presented at other conferences in https://huggingface.co/spaces/DmitryRyumin/NewEraAI-Papers, curated by @DmitryRyumin
Keywords: #VarianceBasedPruning #NetworkCompression #ModelAcceleration #EfficientDeepLearning #VisionTransformers #AI #ICCV2025 #ResearchHighlight
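The post only summarizes the paper, but the general idea behind variance-based structured pruning can be sketched: estimate how much each channel's activations actually vary over a small calibration set, and treat low-variance channels as candidates for removal. The snippet below is a minimal PyTorch illustration of that idea, not the authors' algorithm; the toy model, the chosen layer, and the 50% pruning ratio are placeholder assumptions.

```python
# Minimal sketch of variance-based channel selection for pruning (illustrative
# only; see https://huggingface.co/papers/2507.12988 for the actual method).
import torch
import torch.nn as nn

@torch.no_grad()
def rank_channels_by_variance(model: nn.Module, layer: nn.Module, calib_batches):
    """Collect `layer`'s outputs on calibration data and rank its output
    channels by activation variance (lowest variance first = prune candidates)."""
    acts = []
    handle = layer.register_forward_hook(lambda m, inp, out: acts.append(out.detach()))
    for x in calib_batches:
        model(x)
    handle.remove()
    a = torch.cat(acts)                                          # (N, C, H, W) for conv outputs
    var = a.transpose(0, 1).reshape(a.shape[1], -1).var(dim=1)   # per-channel variance
    return torch.argsort(var)                                    # ascending order

# Toy example: mark the 50% lowest-variance channels of one conv layer.
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, 32, 3, padding=1))
layer = model[0]
calib = [torch.randn(8, 3, 32, 32) for _ in range(4)]
order = rank_channels_by_variance(model, layer, calib)
prune_idx = order[: layer.out_channels // 2]
print("channels to prune:", prune_idx.tolist())
```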
reacted to onekq's post 2 days ago
Context rot is such a catchy phrase, but the problem was identified 2+ years ago under the name attention decay: https://huggingface.co/papers/2307.03172. I spotted the same problem in coding tasks and documented it in my book (https://www.amazon.com/dp/9999331130). Why did this problem become hot again? Because many of us thought it had been solved by long-context models, which is not true. Here we were misled by benchmarks. Most long-context benchmarks are built around the QA scenario, i.e. "finding a needle in a haystack". But in agentic scenarios, the model needs to find EVERYTHING in the haystack, and simply can't afford enough attention for that challenge.
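To make the benchmark distinction concrete, here is a toy sketch contrasting a single-needle QA check with a "find everything" check. The `generate(prompt)` callable is a hypothetical stand-in for whatever long-context model is under test, and the scoring is a crude substring match; none of this comes from the post itself.

```python
# Toy contrast between needle-in-a-haystack QA and an exhaustive "find everything"
# check. `generate` is a placeholder for a long-context model, not a real API.
import random

def build_haystack(needles, filler_lines=2000, seed=0):
    """Bury the needle sentences at random positions inside filler text."""
    rng = random.Random(seed)
    lines = [f"Filler sentence number {i}." for i in range(filler_lines)]
    for needle in needles:
        lines.insert(rng.randrange(len(lines)), needle)
    return "\n".join(lines)

def single_needle_score(generate, needle="The secret code is 4417."):
    """Classic QA benchmark: recall of ONE planted fact."""
    ctx = build_haystack([needle])
    answer = generate(f"{ctx}\n\nQuestion: What is the secret code?")
    return float("4417" in answer)

def find_everything_score(generate, n_needles=20):
    """Agentic-style check: the model must surface ALL planted facts."""
    needles = [f"Task item {i}: rotate key K{i}." for i in range(n_needles)]
    ctx = build_haystack(needles)
    answer = generate(f"{ctx}\n\nList every task item mentioned above.")
    found = sum(f"K{i}" in answer for i in range(n_needles))  # crude substring check
    return found / n_needles   # recall over all needles, not just one
```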
reacted to di-zhang-fdu's post 2 days ago
The training dataset of ChemVLM is now open-sourced; take a look! https://huggingface.co/datasets/di-zhang-fdu/chemvlm-sft-datasets Paper: https://huggingface.co/papers/2408.07246
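The dataset id in that link can be pulled directly with the Hugging Face `datasets` library; the "train" split name below is an assumption, so check the dataset card for the actual configurations and fields.

```python
# Sketch: load the ChemVLM SFT data referenced above. The split name is an
# assumption; consult the dataset card for the real configs/columns.
from datasets import load_dataset

ds = load_dataset("di-zhang-fdu/chemvlm-sft-datasets", split="train")
print(ds)      # features and number of rows
print(ds[0])   # inspect one SFT example
```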
ajibawa-2023's models (32), sorted by recently updated
ajibawa-2023/scarlett-13b • Text Generation • Updated Aug 16, 2023 • 412 downloads • 3 likes
ajibawa-2023/carl-llama-2-13b • Text Generation • Updated Aug 16, 2023 • 411 downloads • 11 likes
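Both listed checkpoints are text-generation models, so either can be tried with the standard transformers pipeline. The dtype and device settings below are illustrative assumptions, and a 13B model needs a suitably large GPU (or quantization) to load this way.

```python
# Sketch: run one of the listed checkpoints with the transformers
# text-generation pipeline. dtype/device choices are assumptions.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="ajibawa-2023/scarlett-13b",
    torch_dtype=torch.float16,
    device_map="auto",
)
out = generator("Tell me a short story about curiosity.", max_new_tokens=100)
print(out[0]["generated_text"])
```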