---
license: mit
tags:
- Mistral_Star
- Mistral_Quiet
- Mistral
- Mixtral
- Question-Answer
- Token-Classification
- Sequence-Classification
- SpydazWeb-AI
- chemistry
- biology
- legal
- code
- climate
- medical
- text-generation-inference
language:
- en
- sw
- ig
- zu
- ca
- es
- pt
- ha
pipeline_tag: text-generation
---
# SpydazWeb AGI

(Trained with heads.) Codebase updated to the Mistral-Nemo codebase. (Perhaps I will build the Nemo version today!)
					
						
## Training Note:

This is the base FP16 model! (It was very hard to get out! I had to use transformers only, and NOT Unsloth!)

To train with transformers alone, the model needs to be on an A100, as it takes a huge amount of memory (it is a special model).

I did manage to load and train the model with Unsloth, but the model did not merge the LoRA to FP16.

### Reason:

Unsloth issues: if you load a 16-bit model, Unsloth loads its own model and trains the LoRA expecting you to merge it afterwards; but if you use a 4-bit model, Unsloth loads your exact model!?

You think... WHAT? They download your tensors but use another model??

Yes: Unsloth has a Mistral modelling file of its own, which is much simpler and lighter weight than the transformers one, so your customizations do not get loaded.

So this file will have to be adjusted before I can fully train each head intensively and test the outcomes correctly... but it's working fine!
					
						
## SpydazWeb AI Model:

This model is based on the world's archive of knowledge: maintaining historical documents and providing services for the survivors of mankind, who may need to construct shelters, develop technologies, or find medical resources, as well as maintain the history of the past, keeping a store of all the religious knowledge and data of the world.

It presents a friendly interface with a personality that is caring and at times flirtatious (non-binary!), and is an expert in all fields, i.e. uncensored, and will not refuse to give information. The model can be used for role play, as many character dialogues were also trained into the model as its personality, enabling a greater perspective and outlook and natural discussion with the agents.

The model was trained to operate in a RAG environment, utilizing retrieved content and internal knowledge to respond to questions or create enriched summaries.
					
						
### After Unsloth Training Warning

This model cannot currently be saved as 16-bit (SOON!), but it can be saved as a merged 4-bit model (see the sketch below).
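If you are training with Unsloth, the merged 4-bit save would look something like the sketch below. This assumes Unsloth's `save_pretrained_merged` helper and uses a placeholder output directory; the 16-bit variant (`save_method="merged_16bit"`) is the one that currently fails for this model.

``` Python
# Sketch: saving the merged 4-bit checkpoint after an Unsloth training run.
# "OUTPUT_DIR" is a placeholder; replace it with your target directory.
model.save_pretrained_merged("OUTPUT_DIR", tokenizer, save_method="merged_4bit")
```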
					
						
					
						
<img src="https://cdn-avatars.huggingface.co/v1/production/uploads/65d883893a52cd9bcd8ab7cf/tRsCJlHNZo1D02kBTmfy9.jpeg" width="300"/>

https://github.com/spydaz
					
						
### General Internal Methods:

Trained for multi-task operations as well as RAG and function calling.

This model is a fully functioning model and is fully uncensored:

* 32k context window (vs 8k context in v0.1)
* Rope-theta = 1e6
* No sliding-window attention
* Talk heads - produce responses which can be used towards the final output
* Pre-thoughts - enable pre-generation of potential artifacts for task solving
* Generates plans for step-by-step thinking
* Generates Python code artifacts for future tasks
* Recalls context for the task internally, to be used as a reference for the task
* Shows thoughts, or hides thought usage (similar to Self-RAG)
					
						
The model has been trained on multiple datasets from the Hugging Face Hub and Kaggle.

The focus has been mainly on methodology:

* Chain of thoughts
* Step-by-step planning
* Tree of thoughts
* Forest of thoughts
* Graph of thoughts
* Agent generation: voting, ranking, ... dual-agent response generation

With these methods the model has gained insights into tasks, enabling knowledge transfer between tasks.

The model has been intensively trained in recalling data previously entered into the matrix.
The model has also been trained on rich data and markdown outputs as much as possible;
it can also generate markdown charts with Mermaid, as in the minimal example below.
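For illustration, here is a minimal Mermaid flowchart of the kind the model can emit inside a fenced markdown block (the chart content is invented for this example):

```mermaid
graph TD
    A[User question] --> B[Plan steps]
    B --> C[Generate answer]
    C --> D[Final response]
```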
					
						
## Training Regimes:
* Alpaca
* ChatML / OpenAI / MistralAI
* Text Generation
* Question/Answer (Chat)
* Instruction/Input/Response (instruct)
* Mistral Standard Prompt (see the format sketch after this list)
* Translation Tasks
* Entity / Topic detection
* Book recall
* Coding challenges, Code Feedback, Code Summarization, Commenting Code
* Agent ranking and response analysis
* Medical tasks
  * PubMed
  * Diagnosis
  * Psychiatry
  * Counselling
  * Life Coaching
  * Note taking
  * Medical SMILES
  * Medical Reporting
  * Virtual laboratory simulations
* Chain-of-thought methods
* One-shot / multi-shot prompting tasks
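As an illustration of the prompt formats named above, here is a minimal sketch of the standard Alpaca and Mistral templates as commonly used by the community; the exact templates used during this model's training are not published here, so treat these as conventions rather than this model's canonical prompts.

``` Python
# Standard Alpaca instruct template (community convention).
ALPACA_TEMPLATE = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Input:
{input}

### Response:
"""

# Standard Mistral instruct template.
MISTRAL_TEMPLATE = "<s>[INST] {instruction} [/INST]"

print(MISTRAL_TEMPLATE.format(instruction="Translate 'hello' into Swahili."))
```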
					
						
This model will be a custom model with internal experts and RAG systems, enabling preprocessing of the task internally before outputting a response.

This is based on the Quiet-STaR reasoning project, which was abandoned earlier in the year. :)

Current update:
This model is working, AND TRAINED!!! Loading it requires trust_remote_code=True.
If it still does not load, then you need to clone the GitHub repository (see LOAD MODEL below).
					
						
# Introduction:

## STAR REASONERS!
					
						
This provides a platform for the model to communicate pre-response, so an internal objective can be set, i.e. adding an extra planning stage to the model that improves its focus and output.

A thought head can be charged with a thought or methodology, such as an instruction to take a step-by-step approach to the problem, or to build an object-oriented model first and consider the use cases before creating an output.

So each thought head can be dedicated to a specific purpose, such as planning, artifact generation, or use-case design, or even deciding which methodology should be applied before planning the potential solution route for the response.

Another head could be dedicated to retrieving content from the self based on the query, which can also be used in the pre-generation stages.

All pre-reasoners can be seen as self-guiding, essentially removing the requirement to give the model a system prompt; instead the heads are aligned to thought pathways!

These chains produce data which can be considered thoughts, and these can further be displayed by framing them with thought tokens, even allowing for editors' comments giving key guidance to the model during training.

These thoughts will be used in future generations, assisting the model as well as displaying explanatory information in the output.

These tokens can be displayed or withheld; this is also a setting in the model (see the sketch below)!
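A minimal sketch of how such framed thoughts could be shown or withheld in post-processing. The token names here are hypothetical; the model's actual thought tokens are defined by its tokenizer/config, so check tokenizer.special_tokens_map before relying on them.

``` Python
import re

# Hypothetical thought-token names - an assumption for this sketch.
START_THOUGHT, END_THOUGHT = "<|startthought|>", "<|endthought|>"

def render(text: str, show_thoughts: bool = False) -> str:
    """Return the generation with thought spans displayed or hidden."""
    if show_thoughts:
        return text
    # Strip everything framed by the thought tokens.
    pattern = re.escape(START_THOUGHT) + r".*?" + re.escape(END_THOUGHT)
    return re.sub(pattern, "", text, flags=re.DOTALL).strip()

print(render("<|startthought|>plan: two steps<|endthought|>The answer is 42."))
# -> "The answer is 42."
```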
					
						
### Can this be applied in other areas?

Yes! We can use this type of method to allow the model to generate code in another channel or head, potentially creating a head that produces artifacts for every output, or produces entity lists for every output, framing the outputs in their relative code tags or function-call tags (a sketch of extracting such framed channels follows below).

These can be displayed or hidden in the response, but they can also be used internally in problem-solving tasks, which again enables the model to simulate the inputs and outputs of an interpreter!

It may even be prudent to include function execution internal to the model (allowing the model to execute functions in the background before responding); this would also have to be specified in the config, as auto-execute or not.
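A sketch of extracting such a framed channel from a raw generation. The `<artifact>` tag is hypothetical; whatever tags the model actually emits would be fixed at training time.

``` Python
import re

def extract_channel(text: str, tag: str) -> list:
    """Pull the contents of <tag>...</tag> spans out of a raw generation."""
    return re.findall(rf"<{tag}>(.*?)</{tag}>", text, flags=re.DOTALL)

raw = "<artifact>print('hello')</artifact> The code above greets the user."
print(extract_channel(raw, "artifact"))  # ["print('hello')"]
```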
					
						
#### AI AGI?

So yes, we can see we are not far from an AI which can evolve: an advanced general intelligent system (still non-sentient, by the way).
					
						
### Conclusion

The reasoner methodology might be seen as the way forward: adding internal functionality to the models, instead of external connectivity, enables faster and seamless model usage, as well as enriched and informed responses, since even outputs could essentially be cleansed and formatted internally before being presented to the calling interface.

The takeaway is that we are seeing the decoder/encoder model as simply a function of an intelligence which in truth needs to be autonomous!

That means internal functions and tools, as well as disk interaction: an agent must have awareness of and control over its environment, with sensors and actuators. As a function-calling model it has actuators, and since it can read directories it has sensors... it's a start: we can get media in and out, but the model needs its own control over input and output too!

Fine tuning: again, this issue of fine tuning. The discussion above explains the requirement to control the environment from within the model (with constraints). Does this eliminate the need to fine-tune a model? In fact it should, as this gives transparency to the growth of the model; if the model fine-tuned itself, we would be in danger of a model evolving!

Hence an AGI!
					
						
# LOAD MODEL

```
!git clone https://github.com/huggingface/transformers.git
## Copy modeling_mistral.py and configuration_mistral.py into transformers/src/transformers/models/mistral, overwriting the existing files first.
## THEN (install from the directory containing the clone; a bare `!cd` does not persist between notebook cells):
!pip install ./transformers
```
					
						
Then restart the environment; the model can then load without trust_remote_code and WILL work FINE!
It can even be trained: hence the 4-bit optimized version.
					
						
``` Python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("LeroyDyer/_Spydaz_Web_AI_MistralStar_V2", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("LeroyDyer/_Spydaz_Web_AI_MistralStar_V2", trust_remote_code=True)
model.tokenizer = tokenizer
```
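Once loaded, generation follows the standard transformers API. A minimal sketch; the prompt and sampling settings below are illustrative, not a prescribed template:

``` Python
# Minimal generation example using the standard transformers API.
prompt = "Explain, step by step, how to purify water."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```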
					
						