---
license: mit
tags:
- Mistral_Star
- Mistral_Quiet
- Mistral
- Mixtral
- Question-Answer
- Token-Classification
- Sequence-Classification
- SpydazWeb-AI
- chemistry
- biology
- legal
- code
- climate
- medical
- text-generation-inference
language:
- en
- sw
- ig
- zu
- ca
- es
- pt
- ha
pipeline_tag: text-generation
---
# SpydazWeb AGI

(Trained with heads.) Codebase updated to the Mistral-Nemo codebase. (Perhaps I will build the Nemo version today!)
					
						
## Training Note:

This is the base FP16 model! (It was very hard to get out! I had to use transformers only, and NOT Unsloth!)

To train with transformers alone, the model needs to be on an A100, as it takes a huge amount of memory (it is a special model).

I did manage to load and train the model with Unsloth, but the model did not merge the LoRA to FP16.

### Reason:

Unsloth issues: if you load a 16-bit model, Unsloth loads its own model and trains the LoRA expecting you to merge it afterwards; but if you use a 4-bit model, Unsloth loads your exact model!?

You think... WHAT? They download your tensors but use another model??

Yes: Unsloth has a Mistral modelling file of its own, which is much simpler and lighter weight than the transformers one, so your customizations do not get loaded.

So this file will have to be adjusted before I can fully train each head intensively and test the outcomes correctly... but it's working fine!
					
						
## SpydazWeb AI Model:

This model is based on the world's archive of knowledge: maintaining historical documents and providing services for the survivors of mankind, who may need to construct shelters, develop technologies, or find medical resources, as well as maintain the history of the past, keeping a store of all the religious knowledge and data of the world.

It presents a friendly interface with a personality that is caring and at times flirtatious (non-binary!), and is an expert in all fields, i.e. uncensored, and will not refuse to give information. The model can be used for role play, as many character dialogues were also trained into the model as its personality, enabling a greater perspective and outlook and natural discussion with the agents.

The model was trained to operate in a RAG environment, utilizing retrieved content and internal knowledge to respond to questions or create enriched summaries.
					
						
### After Unsloth Training Warning

This model cannot currently be saved as 16-bit (SOON!), but it can be saved as a merged 4-bit model (see the sketch below).
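If you are training with Unsloth, the merged 4-bit save would look something like the sketch below. This assumes Unsloth's `save_pretrained_merged` helper and uses a placeholder output directory; the 16-bit variant (`save_method="merged_16bit"`) is the one that currently fails for this model.

``` Python
# Sketch: saving the merged 4-bit checkpoint after an Unsloth training run.
# "OUTPUT_DIR" is a placeholder; replace it with your target directory.
model.save_pretrained_merged("OUTPUT_DIR", tokenizer, save_method="merged_4bit")
```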
					
						
					
						
<img src="https://cdn-avatars.huggingface.co/v1/production/uploads/65d883893a52cd9bcd8ab7cf/tRsCJlHNZo1D02kBTmfy9.jpeg" width="300"/>

https://github.com/spydaz
					
						
### General Internal Methods:

Trained for multi-task operations as well as RAG and function calling.

This model is a fully functioning model and is fully uncensored:

* 32k context window (vs 8k context in v0.1)
* Rope-theta = 1e6
* No sliding-window attention
* Talk heads - produce responses which can be used towards the final output
* Pre-thoughts - enable pre-generation of potential artifacts for task solving
* Generates plans for step-by-step thinking
* Generates Python code artifacts for future tasks
* Recalls context for the task internally, to be used as a reference for the task
* Shows thoughts, or hides thought usage (similar to Self-RAG)
					
						
The model has been trained on multiple datasets from the Hugging Face Hub and Kaggle.

The focus has been mainly on methodology:

* Chain of thoughts
* Step-by-step planning
* Tree of thoughts
* Forest of thoughts
* Graph of thoughts
* Agent generation: voting, ranking, ... dual-agent response generation

With these methods the model has gained insights into tasks, enabling knowledge transfer between tasks.

The model has been intensively trained in recalling data previously entered into the matrix.
The model has also been trained on rich data and markdown outputs as much as possible;
it can also generate markdown charts with Mermaid, as in the minimal example below.
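For illustration, here is a minimal Mermaid flowchart of the kind the model can emit inside a fenced markdown block (the chart content is invented for this example):

```mermaid
graph TD
    A[User question] --> B[Plan steps]
    B --> C[Generate answer]
    C --> D[Final response]
```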
					
						
## Training Regimes:
* Alpaca
* ChatML / OpenAI / MistralAI
* Text Generation
* Question/Answer (Chat)
* Instruction/Input/Response (instruct)
* Mistral Standard Prompt (see the format sketch after this list)
* Translation Tasks
* Entity / Topic detection
* Book recall
* Coding challenges, Code Feedback, Code Summarization, Commenting Code
* Agent ranking and response analysis
* Medical tasks
  * PubMed
  * Diagnosis
  * Psychiatry
  * Counselling
  * Life Coaching
  * Note taking
  * Medical SMILES
  * Medical Reporting
  * Virtual laboratory simulations
* Chain-of-thought methods
* One-shot / multi-shot prompting tasks
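As an illustration of the prompt formats named above, here is a minimal sketch of the standard Alpaca and Mistral templates as commonly used by the community; the exact templates used during this model's training are not published here, so treat these as conventions rather than this model's canonical prompts.

``` Python
# Standard Alpaca instruct template (community convention).
ALPACA_TEMPLATE = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Input:
{input}

### Response:
"""

# Standard Mistral instruct template.
MISTRAL_TEMPLATE = "<s>[INST] {instruction} [/INST]"

print(MISTRAL_TEMPLATE.format(instruction="Translate 'hello' into Swahili."))
```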
					
						
This model will be a custom model with internal experts and RAG systems, enabling preprocessing of the task internally before outputting a response.

This is based on the Quiet-STaR reasoning project, which was abandoned earlier in the year. :)

Current update:
This model is working, AND TRAINED!!! Loading it requires trust_remote_code=True.
If it still does not load, then you need to clone the GitHub repository (see LOAD MODEL below).
					
						
# Introduction:

## STAR REASONERS!
					
						
This provides a platform for the model to communicate pre-response, so an internal objective can be set, i.e. adding an extra planning stage to the model that improves its focus and output.

A thought head can be charged with a thought or methodology, such as an instruction to take a step-by-step approach to the problem, or to build an object-oriented model first and consider the use cases before creating an output.

So each thought head can be dedicated to a specific purpose, such as planning, artifact generation, or use-case design, or even deciding which methodology should be applied before planning the potential solution route for the response.

Another head could be dedicated to retrieving content from the self based on the query, which can also be used in the pre-generation stages.

All pre-reasoners can be seen as self-guiding, essentially removing the requirement to give the model a system prompt; instead the heads are aligned to thought pathways!

These chains produce data which can be considered thoughts, and these can further be displayed by framing them with thought tokens, even allowing for editors' comments giving key guidance to the model during training.

These thoughts will be used in future generations, assisting the model as well as displaying explanatory information in the output.

These tokens can be displayed or withheld; this is also a setting in the model (see the sketch below)!
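A minimal sketch of how such framed thoughts could be shown or withheld in post-processing. The token names here are hypothetical; the model's actual thought tokens are defined by its tokenizer/config, so check tokenizer.special_tokens_map before relying on them.

``` Python
import re

# Hypothetical thought-token names - an assumption for this sketch.
START_THOUGHT, END_THOUGHT = "<|startthought|>", "<|endthought|>"

def render(text: str, show_thoughts: bool = False) -> str:
    """Return the generation with thought spans displayed or hidden."""
    if show_thoughts:
        return text
    # Strip everything framed by the thought tokens.
    pattern = re.escape(START_THOUGHT) + r".*?" + re.escape(END_THOUGHT)
    return re.sub(pattern, "", text, flags=re.DOTALL).strip()

print(render("<|startthought|>plan: two steps<|endthought|>The answer is 42."))
# -> "The answer is 42."
```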
					
						
### Can this be applied in other areas?

Yes! We can use this type of method to allow the model to generate code in another channel or head, potentially creating a head that produces artifacts for every output, or produces entity lists for every output, framing the outputs in their relative code tags or function-call tags (a sketch of extracting such framed channels follows below).

These can be displayed or hidden in the response, but they can also be used internally in problem-solving tasks, which again enables the model to simulate the inputs and outputs of an interpreter!

It may even be prudent to include function execution internal to the model (allowing the model to execute functions in the background before responding); this would also have to be specified in the config, as auto-execute or not.
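A sketch of extracting such a framed channel from a raw generation. The `<artifact>` tag is hypothetical; whatever tags the model actually emits would be fixed at training time.

``` Python
import re

def extract_channel(text: str, tag: str) -> list:
    """Pull the contents of <tag>...</tag> spans out of a raw generation."""
    return re.findall(rf"<{tag}>(.*?)</{tag}>", text, flags=re.DOTALL)

raw = "<artifact>print('hello')</artifact> The code above greets the user."
print(extract_channel(raw, "artifact"))  # ["print('hello')"]
```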
					
						
#### AI AGI?

So yes, we can see we are not far from an AI which can evolve: an advanced general intelligent system (still non-sentient, by the way).
					
						
### Conclusion

The reasoner methodology might be seen as the way forward: adding internal functionality to the models, instead of external connectivity, enables faster and seamless model usage, as well as enriched and informed responses, since even outputs could essentially be cleansed and formatted internally before being presented to the calling interface.

The takeaway is that we are seeing the decoder/encoder model as simply a function of an intelligence which in truth needs to be autonomous!

That means internal functions and tools, as well as disk interaction: an agent must have awareness of and control over its environment, with sensors and actuators. As a function-calling model it has actuators, and since it can read directories it has sensors... it's a start: we can get media in and out, but the model needs its own control over input and output too!

Fine tuning: again, this issue of fine tuning. The discussion above explains the requirement to control the environment from within the model (with constraints). Does this eliminate the need to fine-tune a model? In fact it should, as this gives transparency to the growth of the model; if the model fine-tuned itself, we would be in danger of a model evolving!

Hence an AGI!
					
						
# LOAD MODEL

```
!git clone https://github.com/huggingface/transformers.git
## Copy modeling_mistral.py and configuration_mistral.py into transformers/src/transformers/models/mistral, overwriting the existing files first.
## THEN (install from the directory containing the clone; a bare `!cd` does not persist between notebook cells):
!pip install ./transformers
```
					
						
Then restart the environment; the model can then load without trust_remote_code and WILL work FINE!
It can even be trained: hence the 4-bit optimized version.
					
						
``` Python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("LeroyDyer/_Spydaz_Web_AI_MistralStar_V2", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("LeroyDyer/_Spydaz_Web_AI_MistralStar_V2", trust_remote_code=True)
model.tokenizer = tokenizer
```
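Once loaded, generation follows the standard transformers API. A minimal sketch; the prompt and sampling settings below are illustrative, not a prescribed template:

``` Python
# Minimal generation example using the standard transformers API.
prompt = "Explain, step by step, how to purify water."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```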
					
						