Export¶

xaytune provides utilities for saving, merging, converting, and publishing models after training.

merge()¶

Merge LoRA/QLoRA adapters back into the base model, producing a standalone model that can be used without PEFT.

from xaytune.export.merge import merge

merge("output/lora-finetune", save_to="output/merged")

Function Signature¶

def merge(checkpoint_path: str, *, save_to: str) -> None:

Parameter	Type	Description
`checkpoint_path`	`str`	Path to a LoRA/QLoRA checkpoint directory
`save_to`	`str`	Output directory for the merged model

The merged model and tokenizer are saved in Hugging Face format, ready for inference or further export.

Warning

`merge()` only works with PEFT (LoRA/QLoRA) checkpoints. It raises `ValueError` if the checkpoint is not a PEFT model.
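Because a non-PEFT checkpoint triggers a `ValueError`, it can help to check the directory before merging. The sketch below assumes the checkpoint follows the PEFT convention of storing an `adapter_config.json` file alongside the adapter weights. `looks_like_peft_checkpoint` is a hypothetical helper, not part of xaytune, and xaytune's own detection may use different logic.

```python
from pathlib import Path


def looks_like_peft_checkpoint(checkpoint_path: str) -> bool:
    """Return True if the directory contains a PEFT adapter config.

    Assumption: the checkpoint was written by PEFT's save_pretrained(),
    which stores adapter_config.json in the checkpoint directory.
    xaytune's own check may differ from this heuristic.
    """
    return (Path(checkpoint_path) / "adapter_config.json").is_file()
```

A caller could skip the merge step, or report a clearer error, when this returns `False`.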

save()¶

Save a model and tokenizer to disk with optional metadata.

from xaytune.export.merge import save

save(
    model,
    tokenizer,
    output_dir="output/my-model",
    metadata={"recipe": "finetune", "method": "lora", "epochs": 3},
)

Function Signature¶

def save(
    model: Any,
    tokenizer: Any,
    *,
    output_dir: str,
    metadata: dict[str, Any] | None = None,
) -> None:

Parameter	Type	Description
`model`	model object	The model to save
`tokenizer`	tokenizer object	The tokenizer to save
`output_dir`	`str`	Output directory
`metadata`	`dict` \| `None`	Optional metadata; when non-empty, written to `xaytune_metadata.json` inside `output_dir`
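The metadata file is plain JSON, so it can be read back with the standard library. A minimal sketch, assuming only the file name given in the table above; `read_metadata` is a hypothetical helper and not part of xaytune.

```python
import json
from pathlib import Path
from typing import Any, Optional


def read_metadata(output_dir: str) -> Optional[dict[str, Any]]:
    """Load xaytune_metadata.json from output_dir, or return None if absent."""
    meta_path = Path(output_dir) / "xaytune_metadata.json"
    if not meta_path.is_file():
        # No metadata was passed to save(), or it was empty.
        return None
    return json.loads(meta_path.read_text())
```

Returning `None` for a missing file keeps the caller from having to distinguish "saved without metadata" from an error.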

push_to_hub()¶

Push a model and tokenizer to the Hugging Face Hub.

from xaytune.export.hub import push_to_hub

# From a saved directory
push_to_hub("output/merged", repo="username/my-model")

# From model objects
push_to_hub(model, repo="username/my-model", tokenizer=tokenizer)

Function Signature¶

def push_to_hub(
    model_or_path: Any,
    *,
    repo: str | None = None,
    tokenizer: Any | None = None,
) -> None:

Parameter	Type	Description
`model_or_path`	model object or `str`	A model instance or path to a saved model
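`repo`	`str` \| `None`	HF Hub repository (e.g., `"username/model-name"`). Defaults to `None` in the signature but is required; `ValueError` is raised if it is omitted
`tokenizer`	tokenizer object \| `None`	Tokenizer to push alongside the model. Ignored when `model_or_path` is a string, in which case the tokenizer is loaded from the checkpoint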

Note

You must be authenticated with the Hugging Face Hub. Run `huggingface-cli login` first.
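A script can make a best-effort check for credentials before starting a long upload. The sketch below is an approximation that uses only the standard library. It assumes a token is supplied either through the `HF_TOKEN` environment variable or through the token file written by the login command (located via `HF_TOKEN_PATH`, else `$HF_HOME/token`, else `~/.cache/huggingface/token`). `has_hf_token` is a hypothetical helper; the `huggingface_hub` library's own lookup is authoritative and may consult other locations.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def has_hf_token(env: Optional[Mapping[str, str]] = None) -> bool:
    """Best-effort check for a Hugging Face token (assumed locations only)."""
    env = os.environ if env is None else env
    if env.get("HF_TOKEN"):
        return True
    # Resolve the cache home, then the token file inside it.
    if "HF_HOME" in env:
        hf_home = Path(env["HF_HOME"])
    else:
        hf_home = Path.home() / ".cache" / "huggingface"
    if "HF_TOKEN_PATH" in env:
        token_path = Path(env["HF_TOKEN_PATH"])
    else:
        token_path = hf_home / "token"
    return token_path.is_file() and token_path.read_text().strip() != ""
```

The check only confirms that a token is present, not that it is valid or has write access to the target repository.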

to_gguf()¶

Convert a model to GGUF format for use with llama.cpp and compatible inference engines.

from xaytune.export.gguf import to_gguf

to_gguf(
    "output/merged",
    output="model.gguf",
    quantization="Q4_K_M",
)

Function Signature¶

def to_gguf(
    model_path: str,
    *,
    output: str,
    quantization: str = "Q4_K_M",
) -> None:

Parameter	Type	Default	Description
`model_path`	`str`	required	Path to the model directory
`output`	`str`	required	Output file path for the GGUF file
`quantization`	`str`	`"Q4_K_M"`	GGUF quantization type

Common quantization types: Q4_0, Q4_K_M, Q5_K_M, Q8_0, F16.
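Since `quantization` is a free-form string, a typo only surfaces once the conversion subprocess fails. A small sketch of validating the name up front against the types listed above; `COMMON_QUANT_TYPES` and `normalize_quantization` are hypothetical names, and the list is illustrative rather than the full set the converter may accept.

```python
# Types named in this section; treat the set as illustrative, not exhaustive.
COMMON_QUANT_TYPES = ("Q4_0", "Q4_K_M", "Q5_K_M", "Q8_0", "F16")


def normalize_quantization(name: str) -> str:
    """Upper-case a quantization name and check it against the common list."""
    normalized = name.strip().upper()
    if normalized not in COMMON_QUANT_TYPES:
        raise ValueError(
            f"Unrecognized quantization {name!r}; expected one of {COMMON_QUANT_TYPES}"
        )
    return normalized
```

The returned string can then be passed as the `quantization` argument.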

CLI Usage¶

All export operations are available through the `xaytune export` subcommand:

# Merge LoRA adapters
xaytune export merge --checkpoint output/lora-finetune --output output/merged

# Convert to GGUF
xaytune export gguf --model output/merged --output model.gguf --quant Q4_K_M

# Push to Hugging Face Hub
xaytune export push --model output/merged --repo username/my-model

Typical Export Pipeline¶

A common post-training workflow:

# 1. Train with LoRA
xaytune train --config configs/examples/lora_finetune.yaml

# 2. Merge adapters into base model
xaytune export merge --checkpoint output/lora-finetune --output output/merged

# 3a. Push to Hub for cloud inference
xaytune export push --model output/merged --repo username/my-model

# 3b. Or convert to GGUF for local inference
xaytune export gguf --model output/merged --output model.gguf
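The export steps of this pipeline can also be driven from a Python script. The sketch below builds the same CLI invocations shown above (only the flags that appear in this section) and runs them in order. `export_commands` and `run_all` are hypothetical helpers, and the sketch assumes the `xaytune` executable is on `PATH`.

```python
import subprocess
from typing import Optional


def export_commands(
    checkpoint: str,
    merged_dir: str,
    *,
    repo: Optional[str] = None,
    gguf_path: Optional[str] = None,
) -> list[list[str]]:
    """Build the xaytune CLI invocations for the merge, push, and GGUF steps."""
    commands = [
        ["xaytune", "export", "merge", "--checkpoint", checkpoint, "--output", merged_dir],
    ]
    if repo is not None:
        # Step 3a: push the merged model to the Hub.
        commands.append(["xaytune", "export", "push", "--model", merged_dir, "--repo", repo])
    if gguf_path is not None:
        # Step 3b: convert the merged model to GGUF.
        commands.append(["xaytune", "export", "gguf", "--model", merged_dir, "--output", gguf_path])
    return commands


def run_all(commands: list[list[str]]) -> None:
    """Run each command in order; check=True stops at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Separating command construction from execution makes it easy to print the commands for a dry run before calling `run_all`.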

Full API Reference¶

`merge(checkpoint_path, *, save_to)` ¶

Merge LoRA/QLoRA adapter weights into the base model and save.

Parameters:

Name	Type	Description	Default
`checkpoint_path`	`str`	Path to a PEFT checkpoint directory.	required
`save_to`	`str`	Directory where the merged model and tokenizer are saved.	required

Raises:

Type	Description
`ValueError`	If the checkpoint is not a PEFT model.

Source code in xaytune/export/merge.py

from pathlib import Path


def merge(checkpoint_path: str, *, save_to: str) -> None:
    """Merge LoRA/QLoRA adapter weights into the base model and save.

    Args:
        checkpoint_path: Path to a PEFT checkpoint directory.
        save_to: Directory where the merged model and tokenizer are saved.

    Raises:
        ValueError: If the checkpoint is not a PEFT model.
    """
    from xaytune.models import load_model

    model_result = load_model(checkpoint_path)

    if not model_result.peft_applied:
        raise ValueError(
            f"Model at '{checkpoint_path}' is not a PEFT model. "
            f"merge() only works with LoRA/QLoRA checkpoints."
        )

    merged_model = model_result.model.merge_and_unload()

    Path(save_to).mkdir(parents=True, exist_ok=True)
    merged_model.save_pretrained(save_to)
    model_result.tokenizer.save_pretrained(save_to)

`save(model, tokenizer, *, output_dir, metadata=None)` ¶

Save a model, tokenizer, and optional metadata to disk.

Parameters:

Name	Type	Description	Default
`model`	`Any`	A HuggingFace model instance.	required
`tokenizer`	`Any`	The associated tokenizer.	required
`output_dir`	`str`	Directory to write files to (created if missing).	required
`metadata`	`dict[str, Any] \| None`	Optional dict written as `xaytune_metadata.json`.	`None`

Source code in xaytune/export/merge.py

import json
from pathlib import Path
from typing import Any


def save(
    model: Any,
    tokenizer: Any,
    *,
    output_dir: str,
    metadata: dict[str, Any] | None = None,
) -> None:
    """Save a model, tokenizer, and optional metadata to disk.

    Args:
        model: A HuggingFace model instance.
        tokenizer: The associated tokenizer.
        output_dir: Directory to write files to (created if missing).
        metadata: Optional dict written as ``xaytune_metadata.json``.
    """
    path = Path(output_dir)
    path.mkdir(parents=True, exist_ok=True)

    model.save_pretrained(output_dir)
    tokenizer.save_pretrained(output_dir)

    if metadata:
        meta_path = path / "xaytune_metadata.json"
        meta_path.write_text(json.dumps(metadata, indent=2))

`push_to_hub(model_or_path, *, repo=None, tokenizer=None)` ¶

Push a model and tokenizer to the HuggingFace Hub.

Parameters:

Name	Type	Description	Default
`model_or_path`	`Any`	A model instance or path to a saved checkpoint. If a path string, the model and tokenizer are loaded automatically.	required
`repo`	`str \| None`	Hub repository id (e.g. `"username/model-name"`).	`None`
`tokenizer`	`Any \| None`	Tokenizer to push alongside the model. Ignored when `model_or_path` is a string (tokenizer is loaded from checkpoint).	`None`

Raises:

Type	Description
`ValueError`	If `repo` is not provided.

Source code in xaytune/export/hub.py

from typing import Any


def push_to_hub(
    model_or_path: Any,
    *,
    repo: str | None = None,
    tokenizer: Any | None = None,
) -> None:
    """Push a model and tokenizer to the HuggingFace Hub.

    Args:
        model_or_path: A model instance or path to a saved checkpoint.
            If a path string, the model and tokenizer are loaded automatically.
        repo: Hub repository id (e.g. ``"username/model-name"``).
        tokenizer: Tokenizer to push alongside the model. Ignored when
            *model_or_path* is a string (tokenizer is loaded from checkpoint).

    Raises:
        ValueError: If *repo* is not provided.
    """
    if repo is None:
        raise ValueError("'repo' is required (e.g., 'username/model-name').")

    if isinstance(model_or_path, str):
        from xaytune.models import load_model

        model_result = load_model(model_or_path)
        model = model_result.model
        tokenizer = model_result.tokenizer
    else:
        model = model_or_path

    model.push_to_hub(repo)
    if tokenizer is not None:
        tokenizer.push_to_hub(repo)

`to_gguf(model_path, *, output, quantization='Q4_K_M')` ¶

Convert a HuggingFace model to GGUF format via llama.cpp.

Parameters:

Name	Type	Description	Default
`model_path`	`str`	Path to a saved HuggingFace model directory.	required
`output`	`str`	Destination path for the `.gguf` file.	required
`quantization`	`str`	Quantization scheme (default `"Q4_K_M"`).	`'Q4_K_M'`

Raises:

Type	Description
`FileNotFoundError`	If `model_path` does not exist.
`RuntimeError`	If the conversion subprocess fails.

Source code in xaytune/export/gguf.py

import subprocess
import sys
from pathlib import Path


def to_gguf(
    model_path: str,
    *,
    output: str,
    quantization: str = "Q4_K_M",
) -> None:
    """Convert a HuggingFace model to GGUF format via llama.cpp.

    Args:
        model_path: Path to a saved HuggingFace model directory.
        output: Destination path for the ``.gguf`` file.
        quantization: Quantization scheme (default ``"Q4_K_M"``).

    Raises:
        FileNotFoundError: If *model_path* does not exist.
        RuntimeError: If the conversion subprocess fails.
    """
    model_dir = Path(model_path)
    if not model_dir.exists():
        raise FileNotFoundError(f"Model directory not found: {model_path}")

    output_path = Path(output)
    output_path.parent.mkdir(parents=True, exist_ok=True)

    cmd = [
        sys.executable,
        "-m",
        "llama_cpp.convert",
        str(model_dir),
        "--outfile",
        str(output_path),
        "--outtype",
        quantization,
    ]

    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(f"GGUF conversion failed: {result.stderr}")
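The error handling at the end of `to_gguf()` follows a common pattern: capture the child process's output and raise `RuntimeError` carrying its stderr when the exit code is non-zero. A minimal standalone sketch of that pattern; `run_checked` is a hypothetical helper, and any command can stand in for the real converter.

```python
import subprocess


def run_checked(cmd: list[str]) -> str:
    """Run cmd, return its stdout, and raise RuntimeError with stderr on failure."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # Surface the child's stderr so the caller sees why it failed.
        raise RuntimeError(f"Command failed: {result.stderr}")
    return result.stdout
```

Callers of `to_gguf()` can catch `RuntimeError` in the same way and log the message, which contains the converter's stderr.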
