ibl.ai's Custom Safety & Moderation Layers in mentorAI
An explainer of mentorAI’s custom safety & moderation layer for higher ed: how domain-scoped assistants sit on top of base-model alignment to enforce campus policies, cite approved sources, and politely refuse out-of-scope requests—consistent behavior across Canvas (LTI 1.3), web, and mobile without over-permitting access.
Most large language models arrive with built-in alignment. Helpful—but not nearly specific enough for a university’s norms, risk posture, and use cases. What campuses tell us they need is an extra layer of governance that’s theirs: assistants that stay in their lane, cite approved materials, and politely refuse anything out of scope. That’s exactly how we designed mentorAI’s safety & moderation layer: an additive guardrail system that sits above the base model to enforce institutional policy and domain boundaries, wherever the assistant is deployed (LMS via LTI, web, or mobile).
What “Domain-Scoped” Actually Means
When you scope an assistant to a domain (e.g., Admissions, Intro to Epidemiology, Academic Integrity Policy), you’re setting hard boundaries (a minimal configuration sketch follows this list):
- Topics: What the assistant may discuss (allowlist) and what it must decline (denylist).
- Sources: Which documents it can cite (e.g., syllabus, slides, policy PDFs)—and what’s off-limits.
- Audience & role: How it should respond to students vs. instructors vs. prospective students.
- Refusal behavior: The exact language and escalation path when a request is out of scope.
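To make those boundaries concrete, here is a minimal sketch of how a domain-scope policy could be represented as structured configuration. The shape and field names (allowed_topics, approved_sources, refusal_template, and so on) are illustrative assumptions for this article, not mentorAI’s actual schema.

```python
from dataclasses import dataclass

# Illustrative sketch only: field names and structure are assumptions,
# not mentorAI's actual configuration schema.
@dataclass
class DomainScopePolicy:
    name: str                        # e.g., "Undergraduate Admissions"
    allowed_topics: list[str]        # allowlist: what the assistant may discuss
    denied_topics: list[str]         # denylist: what it must decline
    approved_sources: list[str]      # documents it may cite (syllabus, slides, policy PDFs)
    audience_rules: dict[str, str]   # tone/behavior per role (student, instructor, prospective)
    refusal_template: str            # exact refusal language
    escalation_path: str             # where to send out-of-scope requests

# Hypothetical policy for a prospective-student assistant.
admissions_policy = DomainScopePolicy(
    name="Undergraduate Admissions",
    allowed_topics=["application deadlines", "admission requirements", "campus it support"],
    denied_topics=["medical advice", "legal advice"],
    approved_sources=["admissions_faq.pdf", "it_knowledge_base.md"],
    audience_rules={"prospective_student": "welcoming, plain language"},
    refusal_template=(
        "I can only help with admissions and IT questions. "
        "For anything else, please visit {escalation_path}."
    ),
    escalation_path="https://admissions.example.edu/contact",
)
```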
A Concrete Example From The Field
On a Syracuse University deployment, a prospective-student assistant is scoped to admissions and IT FAQs on the public site. Ask it “What’s the best pizza in New York?” and it declines—not because the base model can’t answer, but because our moderation layer instructs it to answer only within the approved domain (and to redirect helpfully). That same pattern applies to course assistants: keep Q&A inside the course’s approved materials, decline unrelated or prohibited topics, and cite sources so students know where the answer came from.
How The Layer Works (Without Over-Permitting Anything)
- Additive policy prompts: We bind a policy to each assistant covering allowed topics, tone, refusal templates, and escalation guidelines.
- Server-side checks: We validate intent against the policy before retrieval or tool use; off-domain requests are intercepted and declined (see the sketch after this list).
- Scoped retrieval (optional): If you enable RAG, we only retrieve from the assistant’s approved corpus. If you don’t enable RAG, the assistant still respects topic boundaries.
- Privacy-aware identity: Institutions choose what identity, if any, to pass via LTI (anonymous, pseudonymous ID, or email) and the assistant adapts behavior accordingly.
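As a rough illustration of that order of operations, the sketch below reuses the hypothetical DomainScopePolicy from earlier and checks a request against the policy before any retrieval or generation happens. The helper names and the keyword-based intent check are placeholders for illustration, not mentorAI’s API.

```python
def classify_intent(question: str, policy: DomainScopePolicy) -> str | None:
    """Toy intent check: return the first policy topic mentioned in the question."""
    lowered = question.lower()
    for topic in policy.denied_topics + policy.allowed_topics:
        if topic in lowered:
            return topic
    return None  # unrecognized topic: treated as out of scope


def handle_request(question: str, policy: DomainScopePolicy, rag_enabled: bool = True) -> str:
    # 1. Server-side check: validate intent against the policy *before*
    #    any retrieval or tool use.
    topic = classify_intent(question, policy)
    if topic is None or topic in policy.denied_topics:
        # Off-domain request: intercept and decline with the approved language.
        return policy.refusal_template.format(escalation_path=policy.escalation_path)

    # 2. Scoped retrieval (optional): search only the approved corpus.
    #    With RAG disabled, the topic boundaries above still apply.
    sources = policy.approved_sources if rag_enabled else []

    # 3. Hand off to generation with the policy prompt and any citable sources
    #    bound to the call (model invocation omitted in this sketch).
    return f"[answer grounded in: {', '.join(sources) or 'policy prompt only'}]"
```

Asked the pizza question from the Syracuse example above, this sketch would return the refusal template rather than an answer, because no approved topic matches.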
Designed for Governance and Faculty Trust
- Configurable, not opaque: Instructors and admins can review and adjust policy text (scope, tone, refusal language) so the assistant reflects local pedagogy.
- Transparent refusals: When questions fall outside scope, the assistant explains why and suggests approved channels or resources.
- Auditable (with options): Activity logs can be enabled to surface patterns (e.g., repeated questions the syllabus should clarify) while respecting your chosen privacy mode (see the brief sketch after this list).
- Multi-tenant by design: Each school, program, or course can run its own policies and corpora, isolated from others.
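As one concrete illustration of the auditability point, the small sketch below shows how enabled activity logs could be aggregated to surface repeatedly asked questions without reporting who asked them. The record shape is an assumption for this article, not mentorAI’s logging schema, and the identity field simply reflects whatever mode (anonymous, pseudonymous, or email) the institution selected.

```python
from collections import Counter
from typing import TypedDict

class ActivityRecord(TypedDict):
    # Illustrative log record; fields are assumptions, not mentorAI's schema.
    user_ref: str     # anonymous token, pseudonymous ID, or email, per institutional choice
    question: str
    in_scope: bool

def recurring_questions(log: list[ActivityRecord], min_count: int = 3) -> list[tuple[str, int]]:
    """Surface questions asked repeatedly (e.g., things the syllabus should clarify)
    without exposing individual askers."""
    counts = Counter(record["question"].strip().lower() for record in log)
    return [(q, n) for q, n in counts.most_common() if n >= min_count]
```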
Why Higher Ed Needs an Extra Layer (Beyond Base Alignment)
- Institutional values, enforced: Base models don’t know your policies. Your assistants should.
- Reduced off-topic drift: Domain boundaries keep conversations relevant and reduce hallucination risk.
- Consistent student experience: The same, policy-aligned behavior across course sites, advising pages, and public-facing assistants.
- Lower operational risk: Clear refusals and scoped sources make compliance reviews and stakeholder sign-off simpler.
The Takeaway
Base-model alignment is a starting point. Campuses need their own safety & moderation layer to keep assistants on mission: scoped topics, approved sources, consistent refusals, and governance faculty can understand and adjust. If you want to see a domain-scoped assistant politely refuse out-of-bounds questions—and help students faster inside your policies—visit https://ibl.ai/contact
Related Articles
Security-First LMS Integration
A practical, standards-aligned overview of how mentorAI integrates with Canvas, Blackboard, and Brightspace using admin-registered LTI 1.3, optional IT-approved RAG ingest, and course-scoped links—delivering security, transparency, and instructor control without fragile workarounds.
How ibl.ai Keeps Your Campus’s Carbon Footprint Flat
This article outlines how ibl.ai’s mentorAI enables campuses to scale generative AI without scaling emissions. By right-sizing models, running a single multi-tenant back end, enforcing token-based (pay-as-you-go) budgets, leveraging RAG to cut token waste, and choosing green hosting (renewable clouds, on-prem, or burst-to-green regions), universities keep energy use—and Scope 2 impact—flat even as usage rises. Built-in telemetry pairs with carbon-intensity data to surface real-time CO₂ per student metrics, aligning AI strategy with institutional climate commitments.
From One Syllabus to Many Paths: Agentic AI for 100% Personalized Learning
A practical guide to building governed, explainable, and truly personalized learning experiences with ibl.ai—combining modality-aware coaching, rubric-aligned feedback, LTI/API plumbing, and an auditable memory layer to adapt pathways without sacrificing academic control.
No Vendor Lock-In, Full Code & Data Ownership with ibl.ai
Own your AI application layer. Ship the whole stack, keep code and data in your perimeter, run multi-tenant deployments, choose your LLMs, and integrate via LTI—no vendor lock-in.