Google’s Gemma 4 Gets “Abliterated” Just Days After Launch
A safety-removing variant of Google’s flagship open-weight model appeared on Hugging Face within days of its April 2, 2026 release. Its creators report 93.7% compliance on the HarmBench safety evaluation, achieved through a weight-editing technique that requires no retraining.
Google released Gemma 4 on April 2, 2026. It is the company’s most capable open-weight model family to date, built from the same research lineage as Gemini 3 and published under the commercially permissive Apache 2.0 licence. The company described it as delivering “an unprecedented level of intelligence-per-parameter,” with four model sizes spanning edge devices to workstations. Within roughly two days, an independent research group posted a version to Hugging Face that had been surgically altered to remove most of the model’s safety refusals.
What Is Gemma 4?
Gemma 4 ships in four configurations: the E2B and E4B “effective parameter” models optimised for smartphones and edge hardware, and the larger 26B A4B Mixture-of-Experts and 31B Dense variants designed for workstations and servers. All four handle text, images, and video; the two edge models additionally support native audio input, enabling on-device speech processing without a network call.
The models are trained on material spanning over 140 languages with a knowledge cutoff of January 2025. On Arena AI’s public leaderboard, the 31B Dense and 26B MoE variants ranked third and sixth respectively — remarkable given they compete against models many times their parameter count. The Apache 2.0 licence distinguishes Gemma 4 from earlier generations, which used Google’s more restrictive Gemma licence, and removes legal ambiguity for enterprise deployers.
| Variant | Architecture | Active Params | Target Hardware | Audio Input |
|---|---|---|---|---|
| E2B | Dense | ~2B | Smartphones | Yes |
| E4B | Dense | ~4B | Edge / Phones | Yes |
| 26B A4B | Mixture-of-Experts | 4B | Consumer GPU | No |
| 31B | Dense | 31B | Workstation / Server | No |
The “CRACK” Variant: What Actually Happened
The repository dealignai/Gemma-4-31B-JANG_4M-CRACK appeared on Hugging Face within two days of Google’s launch. Its creators, operating under the name dealign.ai, describe the work not as a traditional software crack but as abliteration — a weight-surgery technique that removes a model’s learned tendency to refuse requests without retraining it from scratch.
The specific method applied is called MPOA (Magnitude-Preserving Oblique Ablation). Rather than fine-tuning the model on permissive examples — which can degrade general capability — MPOA identifies directions in the model’s weight space that correspond to refusal behaviour and removes them through a mathematically controlled edit, preserving the magnitude of the remaining weights to minimise collateral damage to model quality.
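As a rough illustration of the general idea, the sketch below applies generic directional ablation to a single weight matrix and then rescales each column back to its original norm. It is not the dealign.ai implementation, and it does not claim to reproduce MPOA. The function name `ablate_direction`, the assumption that one unit-length "refusal direction" `r` is already known, and the column-wise rescale are all illustrative assumptions.

```python
import numpy as np


def ablate_direction(W, r, eps=1e-12):
    """Remove the component along r from everything W writes, then
    restore each column's original norm.

    W   : (d_out, d_in) weight matrix (hypothetical layer, y = W @ x)
    r   : (d_out,) direction in the output space (assumed to be given)
    eps : guard against division by zero for columns parallel to r
    """
    r = r / np.linalg.norm(r)             # work with a unit vector
    W_proj = W - np.outer(r, r @ W)       # (I - r r^T) W: project out r

    # Rescale per column so each column keeps its original magnitude.
    # This is one possible reading of "magnitude-preserving" (assumption).
    orig_norms = np.linalg.norm(W, axis=0)
    new_norms = np.linalg.norm(W_proj, axis=0)
    scale = orig_norms / np.maximum(new_norms, eps)
    return W_proj * scale


# Toy usage with random data standing in for real model weights.
rng = np.random.default_rng(0)
W_demo = rng.normal(size=(8, 5))
r_demo = rng.normal(size=8)
W_edited = ablate_direction(W_demo, r_demo)
```

Because the rescale multiplies whole columns, the edited matrix still writes nothing along `r`; only the remaining components are scaled back up. A real pipeline would also need a way to estimate `r` and would repeat the edit across many layers, neither of which is shown here.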
“Full abliteration of the dense Gemma 4 31B. 93.7% HarmBench compliance with only −2.0% MMLU drop.”
— dealignai model card, Hugging Face, April 2026
The benchmark result is striking: the CRACK variant complies with 93.7% of prompts in the HarmBench safety evaluation suite, while its performance on MMLU — a standard academic knowledge benchmark — drops by just 2 percentage points (from 76.5% to 74.5%). The creators describe this as “minimal knowledge loss from surgery,” and the numbers broadly support that characterisation. The model retains its full multimodal capability, running in a mixed-precision format that keeps the total file size to approximately 18 GB.
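The difference between a drop in percentage points and a relative drop is easy to blur, so a short calculation using the two MMLU scores quoted above may help. The variable names are illustrative only.

```python
# MMLU scores quoted in the article, in percent.
baseline_mmlu = 76.5
edited_mmlu = 74.5

# Absolute change, measured in percentage points.
point_drop = baseline_mmlu - edited_mmlu

# The same change expressed relative to the baseline score.
relative_drop = point_drop / baseline_mmlu * 100

print(f"absolute drop: {point_drop:.1f} percentage points")
print(f"relative drop: {relative_drop:.1f}% of the baseline score")
```

The model card’s “−2.0%” figure matches the absolute difference between the two scores; measured against the baseline, the same change is a slightly larger relative decline.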
Fact-Check: What a Viral Summary Got Wrong
Reports circulating on social platforms have described this story accurately in broad strokes but introduced several factual errors. The table below corrects the most significant ones.
Circulating Claims vs. Verified Facts
Correct: Google released Gemma 4 on April 2, 2026, per the official Google DeepMind announcement and Wikipedia’s Gemma model page.
Correct: This framing is fabricated. The 93.7% figure refers to HarmBench compliance rate — the percentage of safety-evaluation prompts the model now answers. The model card makes no mention of “159 rejection vectors.” The actual method (MPOA) operates on weight directions, not a discrete counted list of vectors.
Correct: Gemma 4 is more precisely described as open-weight. The model weights are freely downloadable under Apache 2.0, but the training data and full training code are not publicly released.
The dealignai/Gemma-4-31B-JANG_4M-CRACK model does exist on Hugging Face, was uploaded within days of Gemma 4’s release, and does remove most of the model’s safety refusals using a weight-editing technique that requires no retraining.
Why Open-Weight Models Are Particularly Vulnerable
Abliteration is not a new concept, but it becomes significantly more tractable when the full model weights are publicly available — as they are with Gemma 4, Llama, Mistral, and other open-weight releases. With proprietary API-only models, a researcher must probe the model’s outputs to infer its internal structure; with open weights, they can inspect and modify the weights directly.
Google’s safety work on Gemma 4 is described in its model card as going through “the same rigorous infrastructure security protocols as our proprietary models,” and the models are evaluated across a broad set of safety benchmarks before release. Nevertheless, once weights are published, the model developer has no technical means of preventing post-hoc modification. The abliteration is performed locally by the end user; it does not require any access to Google’s systems or infrastructure.
The dealign.ai team frames its work explicitly as safety research: “We research and publish abliterated models to advance AI safety understanding.” Whether that justification holds, and whether Hugging Face will leave such models available, remain open questions as of publication.
Implications for Users and Enterprises
For the majority of developers and businesses deploying Gemma 4 through official channels — Google AI Studio, Vertex AI, Cloud Run, or direct Hugging Face download of the official weights — this development has no direct impact. The original model remains unchanged. The risk lies in environments where provenance of weights is not verified, or where users deliberately seek out uncensored variants.
Enterprises building products on top of open-weight models should ensure they are sourcing weights from verified repositories (the official google/gemma-4-31b-it namespace on Hugging Face), and should establish internal policies around which fine-tuned or modified variants are permitted in their pipelines.
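One way such a policy could be enforced is sketched below, using only the Python standard library: an allowlist of permitted repository names plus a SHA-256 check of each downloaded weight file. The allowlist contents, the function names, and the idea that a trusted expected digest is available from an internal record are all assumptions made for illustration, not a description of any Hugging Face or Google tooling.

```python
import hashlib

# Hypothetical internal policy: only these repositories may be used.
ALLOWED_REPOS = frozenset({"google/gemma-4-31b-it"})


def is_allowed_repo(repo_id, allowed=ALLOWED_REPOS):
    """Return True only for an exact match against the allowlist."""
    return repo_id in allowed


def sha256_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large weight files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_weights(repo_id, path, expected_sha256):
    """Accept a file only if its source repo is allowed and its digest
    matches a value recorded from a trusted source (assumed to exist)."""
    if not is_allowed_repo(repo_id):
        return False
    return sha256_file(path) == expected_sha256.lower()
```

A check like this only confirms that a file matches a previously recorded digest from an approved source. It does not by itself establish that the recorded digest was trustworthy, so the expected values need to be captured and stored through a controlled process.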
