UNL College of Engineering AI Makerspace Documentation

Various documents and guides for using the UNL College of Engineering's AI Makerspace

Makerspace Information¶

Accessing the Makerspace¶

To get access to the makerspace, you must complete both:

An online bridge course: Enroll
An in-person onboarding session: Book a Session

Web Interface / JupyterLab¶

The web interface and JupyterLab for the makerspace is located at https://unl-coe-ai-makerspace.nrp-nautilus.io/ .

Command Line / Kubernetes¶

To get access to more resources and customization, you will need to use the command line or Kubernetes API.

You will need to be added to a namespace / group on the National Research Platform to be able to use any resources, and only specific groups have access to the AI Makerspace GPU Box.

The following namespaces / groups currently have access to the DGX box.

huskerai - Contact Seth Polsley for access
unl-coe-huskerai-jupyter - Used for the Jupyter instances only.

Large Language Models (LLMs)¶

Pre-Hosted Models¶

The National Rearch Platform has many pre-hosted models accessible from both a web interface and from an API at no charge.

The NRP team runs an Open WebUI instance with all of the models already available: NRP Open WebUI

API access is also available at no cost: NRP LLM API Documentation

Last Updated: June 9, 2026

Model	NRP/API Name	GPU Count	Parameters	Input Types	Use Case
MiniMaxAI/MiniMax-M2.7	minimax-m2	4x A100 80GB PCIe	230B	--	Cost-efficient or high-throughput agentic coding Long-context code review and refactoring
Qwen/Qwen3-VL-Embedding-8B	qwen3-embedding	2x Tesla V100‑SXM2‑32GB	8B	image, video	Vector databases and semantic search RAG pipelines Multimodal retrieval
Qwen/Qwen3.5-397B-A17B-FP8	qwen3	8x A100‑SXM4‑80GB	397B	image, video	Frontier-quality text and multimodal reasoning Long-context document and repository analysis Research workflows requiring reproducibility
Qwen/Qwen3.6-27B	qwen3‑small	2x H200 NVL 4x RTX A6000	27B	image, video	Latency-sensitive multimodal tasks Agentic coding and tool use Long-context tasks where qwen3 is overkill
Qwen/Qwen3.6-27B	qwen3‑small	4x RTX A6000	27B	image, video	Latency-sensitive multimodal tasks Agentic coding and tool use Long-context tasks where qwen3 is overkill
google/gemma-4-31B-it	gemma	2x RTX A6000	31B	image, video	Multimodal tasks (image/video QA, visual analysis) Efficient general-purpose assistant Workflows where reasoning is occasional, not constant Reproducible research (pinnable model)
google/gemma-4-E4B-it	gemma‑4‑e4b	1x GeForce RTX 3090 1x RTX 5000 Ada Generation	8B	image, video, audio	Audio transcription and speech-to-text workflows Lightweight multimodal tasks Fast, low-cost inference for simple queries
moonshotai/Kimi‑K2.6	kimi	8x A100‑SXM4‑80GB	1T	image, video	Agentic coding Large-repo code understanding Multimodal coding tasks (UI screenshots, diagrams)
moonshotai/Kimi‑K2.6	kimi	4x RTX PRO 6000 Blackwell  Max‑Q Workstation Edition	1T	image, video	Agentic coding Large-repo code understanding Multimodal coding tasks (UI screenshots, diagrams)
nvidia/GLM‑5.1‑NVFP4	glm‑5	4x H200 NVL	744B	--	Agentic coding workflows Long-form reasoning and text tasks Tool-using agents
openai/gpt‑oss‑120b	gpt‑oss	2x RTX A6000	120B	--	General-purpose chat and assistants Agentic tool-using workflows Reproducible research (pinnable model)

Models and information provided by NRP's LLM Documentation ¹

Running your own model¶

If you can use the pre-hosted models, please do so.

The models already hosted by NRP have dedicated resources for the models and are larger parameter models.

Instructions are a WIP

https://nrp.ai/documentation/userdocs/ai/llm-managed/models/ ↩