Create README.md
---
license: mit
tags:
- Mistral_Star
- Mistral_Quiet
- Mistral
- Mixtral
- Question-Answer
- Token-Classification
- Sequence-Classification
- SpydazWeb-AI
- chemistry
- biology
- legal
- code
- climate
- medical
- text-generation-inference
language:
- en
- sw
- ig
- zu
- ca
- es
- pt
- ha
pipeline_tag: text-generation
---
# SpydazWeb AGI

This is based on the Quiet Star Reasoning Project, which was abandoned earlier in the year :)

Current Update:
This model is working, but currently untrained: to load the model it requires `trust_remote_code=True`.
If it still does not load, you need to clone the transformers GitHub repository:

```
!git clone https://github.com/huggingface/transformers.git
## copy modeling_mistral.py and configuration.py into transformers/src/transformers/models/mistral and overwrite the existing files first
## THEN:
!pip install ./transformers
```

Then restart the environment: the model can then load without `trust_remote_code` and will work fine!
It can even be trained: hence the 4-bit optimised version.

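A minimal loading sketch, assuming a placeholder repository id (substitute the actual model repo); `trust_remote_code=True` and the optional 4-bit `BitsAndBytesConfig` are standard transformers / bitsandbytes options, not project-specific code:

```
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "SpydazWeb-AI/SpydazWeb-AGI"  # placeholder: replace with the actual repo id

# Optional 4-bit quantisation (the "4-bit optimised" path mentioned above)
bnb_config = BitsAndBytesConfig(load_in_4bit=True)

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,          # needed unless the patched transformers fork is installed
    quantization_config=bnb_config,  # remove to load in full precision
    device_map="auto",
)
```
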
# Introduction:

## STAR REASONERS!

This provides a platform for the model to communicate pre-response, so an internal objective can be set, i.e. adding an extra planning stage to the model that improves its focus and output.
The thought head can be charged with a thought or methodology, such as a setting to take a step-by-step approach to the problem, or to build an object-oriented model first and consider the use cases before creating an output.
So each thought head can be dedicated to a specific purpose such as planning, artifact generation or use-case design, or even deciding which methodology should be applied before planning the potential solve route for the response.
Another head could also be dedicated to retrieving content based on the query from the self, which can also be used in the pre-generation stages.
All pre-reasoners can be seen to be self-guiding, essentially removing the requirement to give the model a system prompt and instead aligning the heads to thought pathways!
These chains produce data which can be considered to be thoughts, and can further be displayed by framing these thoughts with thought tokens, even allowing for editor's comments giving key guidance to the model during training.
These thoughts will be used in future generations, assisting the model as well as displaying explanatory information in the output.

These thought tokens can be displayed or withheld, also a setting in the model!
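A minimal sketch of that show/hide behaviour; the `<|startthought|>` / `<|endthought|>` marker names and the `show_thoughts` flag are illustrative assumptions, not the model's confirmed token names:

```
import re

# Hypothetical thought markers framing the pre-response reasoning
START, END = "<|startthought|>", "<|endthought|>"

def render(raw_output: str, show_thoughts: bool = False) -> str:
    """Either keep the framed thoughts visible or strip them before display."""
    if show_thoughts:
        return raw_output
    # Remove every <|startthought|>...<|endthought|> span
    return re.sub(re.escape(START) + r".*?" + re.escape(END), "", raw_output, flags=re.DOTALL).strip()

raw = f"{START}Plan: outline the steps, then answer.{END}The answer is 42."
print(render(raw))                       # -> "The answer is 42."
print(render(raw, show_thoughts=True))   # keeps the framed thought visible
```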

### Can this be applied in other areas?

Yes! We can use this type of method to allow the model to generate code in another channel or head, potentially creating a head that produces artifacts for every output, or that produces entity lists for every output, framing the outputs in their relevant code tags or function-call tags.
These can also be displayed or hidden in the response, but they can also be used in problem-solving tasks internally, which again enables the model to simulate the inputs and outputs of an interpreter!
It may even be prudent to include a function executor internal to the model (allowing the model to execute functions in the background before responding); this would also have to be specified in the config, as auto-execute or not (see the sketch below).

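A sketch of that idea; the tag names, the `auto_execute` flag, and the `TOOLS` registry are hypothetical illustrations, not the model's actual interface:

```
import re

# Hypothetical internal channel: function calls framed in tags, optionally executed before responding
CALL_PATTERN = re.compile(r"<function_call>(\w+)\((.*?)\)</function_call>", re.DOTALL)

TOOLS = {"add": lambda a, b: a + b}  # example background function

def resolve(internal_channel: str, auto_execute: bool = True) -> str:
    """If auto_execute is set, replace each framed call with its result; otherwise leave the tags visible."""
    if not auto_execute:
        return internal_channel
    def run(match: re.Match) -> str:
        name, args = match.group(1), match.group(2)
        result = TOOLS[name](*[int(a) for a in args.split(",")])
        return str(result)
    return CALL_PATTERN.sub(run, internal_channel)

print(resolve("The total is <function_call>add(2, 3)</function_call>."))  # -> "The total is 5."
```
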
### Conclusion

The reasoner methodology might be seen to be the way forwards: adding internal functionality to the models instead of external connectivity enables faster and seamless model usage, as well as enriched and informed responses, as even outputs could essentially be cleaned and formatted internally to the model before being presented to the calling interface.
The takeaway is: are we seeing the decoder/encoder model as simply a function of the intelligence, which in truth needs to be autonomous?
I.e. internal functions and tools as well as disk interaction: an agent must have awareness and control over its environment, with sensors and actuators. As a function-calling model it has actuators, and since it can read the directories it has sensors... it's a start: we can get media in and out, but the model needs to gain its own control over input and output as well!
....

Fine tuning: again this issue of fine tuning. The discussion above explains the requirement to control the environment from within the model (with constraints); does this eliminate the need to fine-tune a model?
In fact it should, as this gives transparency to the growth of the model, and if the model fine-tuned itself we would be in danger of a model evolving!
Hence an AGI!

#### AI AGI?
So yes, we can see we are not far from an AI which can evolve: an advanced general intelligent system (still non-sentient, by the way).


<img src="https://cdn-avatars.huggingface.co/v1/production/uploads/65d883893a52cd9bcd8ab7cf/tRsCJlHNZo1D02kBTmfy9.jpeg" width="300"/>

https://github.com/spydaz

    * 32k context window (vs 8k context in v0.1)
    * Rope-theta = 1e6
    * No Sliding-Window Attention
    * Talk heads - produce responses which can be used towards the final output
    * Pre-Thoughts - enables pre-generation steps of potential artifacts for task solving:
      * Generates plans for step-by-step thinking
      * Generates Python code artifacts for future tasks
      * Recalls context for the task internally, to be used as reference for the task
    * Show thoughts or hide thought usage (similar to Self-RAG)

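A hedged illustration of the first three settings above, expressed as standard `MistralConfig` fields (the custom Quiet-Star configuration may define additional fields):

```
from transformers import MistralConfig

# Illustrative only: the base-architecture settings listed above
config = MistralConfig(
    max_position_embeddings=32768,  # 32k context window
    rope_theta=1e6,                 # Rope-theta = 1e6
    sliding_window=None,            # no sliding-window attention
)
```
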
This model will be a custom model with internal experts and RAG systems,
enabling preprocessing of the task internally before outputting a response.

## SpydazWeb AI model:

This model is based on the world's archive of knowledge, maintaining historical documents and providing services for the survivors of mankind,
who may need to construct shelters, develop technologies or medical resources, as well as maintain the history of the past, keeping store of all the religious knowledge and data of the world.
A friendly interface with a personality: caring and flirtatious at times, non-binary!...
An expert in all fields: i.e. uncensored and will not refuse to give information. The model can be used for role play, as many character dialogues were also trained into the model as its personality, to enable a greater perspective and outlook and natural discussion with the agents.
The model was trained to operate in a RAG environment, utilizing content and internal knowledge to respond to questions or create enriched summaries.


### General Internal Methods:

Trained for multi-task operations as well as RAG and function calling.

This model is a fully functioning model and is fully uncensored.

The model has been trained on multiple datasets from the Hugging Face hub and Kaggle.

The focus has been mainly on methodology:

* Chain of thoughts
* Step-by-step planning
* Tree of thoughts
* Forest of thoughts
* Graph of thoughts
* Agent generation: voting, ranking, ... dual-agent response generation

With these methods the model has gained insights into tasks, enabling knowledge transfer between tasks.

The model has been intensively trained in recalling data previously entered into the matrix.
The model has also been trained on rich data and markdown outputs as much as possible:
the model can also generate markdown charts with Mermaid.


## Training Regimes:
  * Alpaca
  * ChatML / OpenAI / MistralAI
  * Text Generation
  * Question/Answer (Chat)
  * Instruction/Input/Response (instruct)
  * Mistral Standard Prompt
  * Translation Tasks
  * Entity / Topic detection
  * Book recall
  * Coding challenges, Code Feedback, Code Summarization, Commenting Code
  * Agent Ranking and response analysis
  * Medical tasks
    * PubMed
    * Diagnosis
    * Psychiatry
    * Counselling
    * Life Coaching
    * Note taking
    * Medical smiles
    * Medical Reporting
  * Virtual laboratory simulations
  * Chain of thoughts methods
  * One shot / Multi shot prompting tasks
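As an illustration of one of the prompt formats listed above, this is the standard Alpaca instruction template (the exact template variants used during training are not documented here, so treat this as a representative example):

```
# Standard Alpaca instruction/input/response template, shown as a Python format string
alpaca_prompt = (
    "Below is an instruction that describes a task, paired with an input that provides "
    "further context. Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)

print(alpaca_prompt.format(
    instruction="Summarize the passage in one sentence.",
    input="The SpydazWeb model frames its internal thoughts with thought tokens.",
))
```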
