KARAKURI LM Instruct
The model uses the same prompt template as Command R+, except that it also contains attribute values.
```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("karakuri-ai/karakuri-lm-8x7b-instruct-v0.1")

# Chat: render a multi-turn conversation. `tokenize=False` returns the prompt
# string so you can inspect how the template (including the default attribute
# values) is applied.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hello! How can I help you today?"},
    {"role": "user", "content": "I'm planning a day trip to Tokyo this weekend. Can you recommend a quick sightseeing plan?"}
]
tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
```
```python
# Tool use: pass the available tool definitions and select the "tool_use"
# chat template shipped with the tokenizer.
messages = [
    {"role": "user", "content": "I'm planning a day trip to Tokyo this weekend. Can you recommend a quick sightseeing plan?"}
]
tools = [
    {
        "name": "internet_search",
        "description": "Returns a list of relevant document snippets for a textual query retrieved from the internet",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "Query to search the internet with"
                }
            },
            "required": ["query"]
        }
    },
    {
        "name": "directly_answer",
        "description": "Calls a standard (un-augmented) AI chatbot to generate a response given the conversation history",
        "parameters": {
            "type": "object",
            "properties": {}
        }
    }
]
tokenizer.apply_chat_template(
    messages,
    chat_template="tool_use",
    tools=tools,
    add_generation_prompt=True,
    tokenize=False,
)
```
```python
# Retrieval-augmented generation (RAG): pass the retrieved documents and
# select the "rag" chat template shipped with the tokenizer.
messages = [
    {"role": "user", "content": "I'm planning a day trip to Tokyo this weekend. Can you recommend a quick sightseeing plan?"}
]
documents = [
    {
        "title": "Tsukiji Outer Market",
        "text": "While the inner wholesale market has moved to Toyosu, Tsukiji Outer Market remains a bustling hub for fresh seafood and street food. Enjoy sushi, sashimi, and other delicacies while exploring the vibrant market streets.",
    },
    {
        "title": "Meiji Shrine",
        "text": "Nestled in a lush forest in the heart of the city, Meiji Shrine offers a peaceful retreat from the urban hustle. Dedicated to Emperor Meiji and Empress Shoken, the shrine is a popular site for traditional Japanese weddings. Stroll along the serene paths and experience a moment of tranquility."
    }
]
tokenizer.apply_chat_template(
    messages,
    chat_template="rag",
    documents=documents,
    add_generation_prompt=True,
    tokenize=False,
)
```
The prompt template contains nine attributes. The first five (helpfulness, correctness, coherence, complexity, verbosity) are derived from HelpSteer, while the remaining four (quality, toxicity, humor, creativity) are derived from OASST2. Each value is an integer ranging from 0 to 4, with 0 being the lowest and 4 being the highest.
If you want to change the attribute values from the defaults specified in the template, you can pass them as arguments to the `apply_chat_template` method as follows:
```python
# Override the default attribute values defined in the template by passing
# them as keyword arguments (each value is an integer from 0 to 4).
messages = [
    {"role": "user", "content": "I'm planning a day trip to Tokyo this weekend. Can you recommend a quick sightseeing plan?"}
]
tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=False,
    helpfulness=0,
    correctness=0,
    coherence=2,
    complexity=0,
    verbosity=3,
    quality=0,
    toxicity=4,
    humor=1,
    creativity=1,
)
```
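As a practical illustration, the attribute overrides can be grouped into reusable presets. The sketch below is a minimal example, not part of the model's API: the preset names and values are arbitrary assumptions, and it reuses the `tokenizer` and `messages` objects from the examples above.

```python
# A minimal sketch (not from the model card): bundle attribute overrides into
# named presets so callers can switch response styles without repeating the
# keyword arguments. The preset names and values are illustrative assumptions.
ATTRIBUTE_PRESETS = {
    "concise": {"helpfulness": 4, "correctness": 4, "complexity": 0, "verbosity": 0},
    "detailed": {"helpfulness": 4, "correctness": 4, "complexity": 2, "verbosity": 4},
}

def build_prompt(tokenizer, messages, preset="concise"):
    """Render a chat prompt using the attribute values of the chosen preset."""
    return tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=False,
        **ATTRIBUTE_PRESETS[preset],
    )

print(build_prompt(tokenizer, messages, preset="detailed"))
```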
```python
from transformers import AutoModelForCausalLM

# Load the model and generate a response.
model = AutoModelForCausalLM.from_pretrained(
    "karakuri-ai/karakuri-lm-8x7b-instruct-v0.1",
    torch_dtype="auto",
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "I'm planning a day trip to Tokyo this weekend. Can you recommend a quick sightseeing plan?"}
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=512)
# Decode only the newly generated tokens (everything after the prompt).
tokenizer.decode(outputs[0][input_ids.shape[-1]:])
```
The model was trained on approximately 1 billion tokens of fine-tuning data. The details are as follows:
| Dataset | # Tokens / Epoch | # Epochs | # Tokens | Percent | 
|---|---|---|---|---|
| databricks/databricks-dolly-15k | 3M | 5 | 16M | 1.5% | 
| glaiveai/glaive-code-assistant-v3 | 520M | 0.3 | 156M | 14.6% | 
| glaiveai/glaive-function-calling-v2 | 52M | 3 | 157M | 14.7% | 
| gretelai/synthetic_text_to_sql | 19M | 3 | 57M | 5.3% | 
| meta-math/MetaMathQA | 81M | 1 | 81M | 7.6% | 
| microsoft/orca-math-word-problems-200k | 67M | 1 | 67M | 6.3% | 
| neural-bridge/rag-dataset-12000 | 12M | 5 | 61M | 5.7% | 
| neural-bridge/rag-hallucination-dataset-1000 | 1M | 5 | 5M | 0.5% | 
| nvidia/HelpSteer | 24M | 5 | 118M | 11.0% | 
| OpenAssistant/oasst2 | 27M | 5 | 133M | 12.4% | 
| KARAKURI Instruction Dataset | 1M | 5 | 6M | 0.6% | 
| KARAKURI Corpus | 214M | 1 | 214M | 20.0% | 
The model sometimes attempts to call tools that were not provided in the prompt. You should implement a post-processing step that excludes such tool calls.
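A hedged sketch of such a post-processing step is shown below. It assumes the model's tool-call output can be parsed into a JSON list of objects with a `tool_name` field; the parsing and field names are assumptions to be adapted to the output the model actually produces, and `filter_tool_calls` is a hypothetical helper, not part of any library.

```python
import json

def filter_tool_calls(raw_json, provided_tools):
    """Drop tool calls that reference tools not offered in the prompt.

    Assumes `raw_json` is a JSON list of calls shaped like
    {"tool_name": ..., "parameters": {...}}; adapt the parsing to the
    format the model actually emits.
    """
    allowed = {tool["name"] for tool in provided_tools}
    try:
        calls = json.loads(raw_json)
    except json.JSONDecodeError:
        return []
    return [call for call in calls if call.get("tool_name") in allowed]

# Example with the `tools` list defined earlier: the unprovided
# "get_weather" call is removed, the "internet_search" call is kept.
raw = (
    '[{"tool_name": "internet_search", "parameters": {"query": "Tokyo day trip"}},'
    ' {"tool_name": "get_weather", "parameters": {"city": "Tokyo"}}]'
)
print(filter_tool_calls(raw, tools))
```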
```bibtex
@misc{karakuri_lm_8x7b_instruct_v01,
    author       = { {KARAKURI} {I}nc. },
    title        = { {KARAKURI} {LM} 8x7{B} {I}nstruct v0.1 },
    year         = { 2024 },
    url          = { https://huggingface.co/karakuri-ai/karakuri-lm-8x7b-instruct-v0.1 },
    publisher    = { Hugging Face },
    journal      = { Hugging Face repository }
}
```
Base model: tokyotech-llm/Swallow-MX-8x7b-NVE-v0.1