UNL College of Engineering AI Makerspace Documentation
Various documents and guides for using the UNL College of Engineering's AI Makerspace
Makerspace Information¶
Accessing the Makerspace¶
To get access to the makerspace, you must complete both:
- An online bridge course: Enroll
- An in-person onboarding session: Book a Session
Web Interface / JupyterLab¶
The web interface and JupyterLab for the
Command Line / Kubernetes¶
Large Language Models (LLMs)¶
Pre-Hosted Models¶
The National Rearch Platform has many pre-hosted models accessible from both a web interface and from an API at no charge.
The NRP team runs an Open WebUI instance with all of the models already available: NRP Open WebUI
API access is also available at no cost: NRP LLM API Documentation
Last Updated: June 9, 2026
| Model | NRP/API Name | GPU Count | Parameters | Input Types | Use Case |
|---|---|---|---|---|---|
| MiniMaxAI/MiniMax-M2.7 | minimax-m2 | 4x A100 80GB PCIe | 230B | -- | Cost-efficient or high-throughput agentic coding Long-context code review and refactoring |
| Qwen/Qwen3-VL-Embedding-8B | qwen3-embedding | 2x Tesla V100‑SXM2‑32GB | 8B | image, video | Vector databases and semantic search RAG pipelines Multimodal retrieval |
| Qwen/Qwen3.5-397B-A17B-FP8 | qwen3 | 8x A100‑SXM4‑80GB | 397B | image, video | Frontier-quality text and multimodal reasoning Long-context document and repository analysis Research workflows requiring reproducibility |
| Qwen/Qwen3.6-27B | qwen3‑small | 2x H200 NVL 4x RTX A6000 |
27B | image, video | Latency-sensitive multimodal tasks Agentic coding and tool use Long-context tasks where qwen3 is overkill |
| Qwen/Qwen3.6-27B | qwen3‑small | 4x RTX A6000 | 27B | image, video | Latency-sensitive multimodal tasks Agentic coding and tool use Long-context tasks where qwen3 is overkill |
| google/gemma-4-31B-it | gemma | 2x RTX A6000 | 31B | image, video | Multimodal tasks (image/video QA, visual analysis) Efficient general-purpose assistant Workflows where reasoning is occasional, not constant Reproducible research (pinnable model) |
| google/gemma-4-E4B-it | gemma‑4‑e4b | 1x GeForce RTX 3090 1x RTX 5000 Ada Generation |
8B | image, video, audio | Audio transcription and speech-to-text workflows Lightweight multimodal tasks Fast, low-cost inference for simple queries |
| moonshotai/Kimi‑K2.6 | kimi | 8x A100‑SXM4‑80GB | 1T | image, video | Agentic coding Large-repo code understanding Multimodal coding tasks (UI screenshots, diagrams) |
| moonshotai/Kimi‑K2.6 | kimi | 4x RTX PRO 6000 Blackwell Max‑Q Workstation Edition |
1T | image, video | Agentic coding Large-repo code understanding Multimodal coding tasks (UI screenshots, diagrams) |
| nvidia/GLM‑5.1‑NVFP4 | glm‑5 | 4x H200 NVL | 744B | -- | Agentic coding workflows Long-form reasoning and text tasks Tool-using agents |
| openai/gpt‑oss‑120b | gpt‑oss | 2x RTX A6000 | 120B | -- | General-purpose chat and assistants Agentic tool-using workflows Reproducible research (pinnable model) |
Models and information provided by NRP's LLM Documentation 1
Running your own model¶
If you can use the pre-hosted models, please do so.
The models already hosted by NRP have dedicated resources for the models and are larger parameter models.
Instructions are a WIP