Feynman Innovations
ajibawa-2023
176 followers · 23 following
AjinkyaBawase
AI & ML interests
LLM, RL, DL, ML, AGI. Developing LLMs (preferably fully fine-tuned) for various use cases.
Recent Activity
reacted to DmitryRyumin's post 2 days ago
New Research Alert - ICCV 2025 (Oral)!
Title: Variance-based Pruning for Accelerating and Compressing Trained Networks
Description: The one-shot pruning method efficiently compresses networks, reducing computation and memory usage while retaining almost full performance and requiring minimal fine-tuning.
Authors: Uranik Berisha, Jens Mehnert, and Alexandru Paul Condurache
Conference: ICCV, 19–23 Oct 2025 | Honolulu, Hawai'i, USA
Paper: https://huggingface.co/papers/2507.12988
ICCV-2023-25-Papers: https://github.com/DmitryRyumin/ICCV-2023-25-Papers
Added to the Efficient Learning section: https://github.com/DmitryRyumin/ICCV-2023-25-Papers/blob/main/sections/2025/main/efficient-learning.md
More papers: more cutting-edge research presented at other conferences in https://huggingface.co/spaces/DmitryRyumin/NewEraAI-Papers, curated by @DmitryRyumin
Keywords: #VarianceBasedPruning #NetworkCompression #ModelAcceleration #EfficientDeepLearning #VisionTransformers #AI #ICCV2025 #ResearchHighlight
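The post only summarizes the paper, but the general idea behind variance-based structured pruning can be sketched: estimate how much each channel's activations actually vary over a small calibration set, and treat low-variance channels as candidates for removal. The snippet below is a minimal PyTorch illustration of that idea, not the authors' algorithm; the toy model, the chosen layer, and the 50% pruning ratio are placeholder assumptions.

```python
# Minimal sketch of variance-based channel selection for pruning (illustrative
# only; see https://huggingface.co/papers/2507.12988 for the actual method).
import torch
import torch.nn as nn

@torch.no_grad()
def rank_channels_by_variance(model: nn.Module, layer: nn.Module, calib_batches):
    """Collect `layer`'s outputs on calibration data and rank its output
    channels by activation variance (lowest variance first = prune candidates)."""
    acts = []
    handle = layer.register_forward_hook(lambda m, inp, out: acts.append(out.detach()))
    for x in calib_batches:
        model(x)
    handle.remove()
    a = torch.cat(acts)                                          # (N, C, H, W) for conv outputs
    var = a.transpose(0, 1).reshape(a.shape[1], -1).var(dim=1)   # per-channel variance
    return torch.argsort(var)                                    # ascending order

# Toy example: mark the 50% lowest-variance channels of one conv layer.
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, 32, 3, padding=1))
layer = model[0]
calib = [torch.randn(8, 3, 32, 32) for _ in range(4)]
order = rank_channels_by_variance(model, layer, calib)
prune_idx = order[: layer.out_channels // 2]
print("channels to prune:", prune_idx.tolist())
```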
reacted to onekq's post 2 days ago
Context rot is such a catchy phrase, but the problem was identified 2+ years ago under the name attention decay: https://huggingface.co/papers/2307.03172. I spotted the same problem in coding tasks and documented it in my book (https://www.amazon.com/dp/9999331130). Why did this problem become hot again? Because many of us thought it had been solved by long-context models, which is not true. Here we were misled by benchmarks. Most long-context benchmarks are built around the QA scenario, i.e. "finding a needle in a haystack". But in agentic scenarios, the model needs to find EVERYTHING in the haystack, and simply can't afford enough attention for that challenge.
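To make the benchmark distinction concrete, here is a toy sketch contrasting a single-needle QA check with a "find everything" check. The `generate(prompt)` callable is a hypothetical stand-in for whatever long-context model is under test, and the scoring is a crude substring match; none of this comes from the post itself.

```python
# Toy contrast between needle-in-a-haystack QA and an exhaustive "find everything"
# check. `generate` is a placeholder for a long-context model, not a real API.
import random

def build_haystack(needles, filler_lines=2000, seed=0):
    """Bury the needle sentences at random positions inside filler text."""
    rng = random.Random(seed)
    lines = [f"Filler sentence number {i}." for i in range(filler_lines)]
    for needle in needles:
        lines.insert(rng.randrange(len(lines)), needle)
    return "\n".join(lines)

def single_needle_score(generate, needle="The secret code is 4417."):
    """Classic QA benchmark: recall of ONE planted fact."""
    ctx = build_haystack([needle])
    answer = generate(f"{ctx}\n\nQuestion: What is the secret code?")
    return float("4417" in answer)

def find_everything_score(generate, n_needles=20):
    """Agentic-style check: the model must surface ALL planted facts."""
    needles = [f"Task item {i}: rotate key K{i}." for i in range(n_needles)]
    ctx = build_haystack(needles)
    answer = generate(f"{ctx}\n\nList every task item mentioned above.")
    found = sum(f"K{i}" in answer for i in range(n_needles))  # crude substring check
    return found / n_needles   # recall over all needles, not just one
```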
reacted to di-zhang-fdu's post 2 days ago
The training dataset of ChemVLM is now open-sourced; take a look! https://huggingface.co/datasets/di-zhang-fdu/chemvlm-sft-datasets Paper: https://huggingface.co/papers/2408.07246
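The dataset id in that link can be pulled directly with the Hugging Face `datasets` library; the "train" split name below is an assumption, so check the dataset card for the actual configurations and fields.

```python
# Sketch: load the ChemVLM SFT data referenced above. The split name is an
# assumption; consult the dataset card for the real configs/columns.
from datasets import load_dataset

ds = load_dataset("di-zhang-fdu/chemvlm-sft-datasets", split="train")
print(ds)      # features and number of rows
print(ds[0])   # inspect one SFT example
```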
ajibawa-2023's models (32), sorted by recently updated
ajibawa-2023/scarlett-13b • Text Generation • Updated Aug 16, 2023 • 412 downloads • 3 likes
ajibawa-2023/carl-llama-2-13b • Text Generation • Updated Aug 16, 2023 • 411 downloads • 11 likes
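Both listed checkpoints are text-generation models, so either can be tried with the standard transformers pipeline. The dtype and device settings below are illustrative assumptions, and a 13B model needs a suitably large GPU (or quantization) to load this way.

```python
# Sketch: run one of the listed checkpoints with the transformers
# text-generation pipeline. dtype/device choices are assumptions.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="ajibawa-2023/scarlett-13b",
    torch_dtype=torch.float16,
    device_map="auto",
)
out = generator("Tell me a short story about curiosity.", max_new_tokens=100)
print(out[0]["generated_text"])
```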