How to Use Gemma 4 in Android Studio with Ollama: Local AI Coding for Kotlin, Flutter, and Gradle

Coding Liquids blog cover featuring Sagnik Bhattacharya for How to Use Gemma 4 in Android Studio with Ollama, showing Android Studio IDE with local AI coding assistant integration.

This is the companion guide to my VS Code setup post, which became one of the most-read articles on this site within days of publishing. Dozens of readers asked the same follow-up question: "Does this work in Android Studio too?" The short answer is yes — with some important differences in how you connect Ollama to a JetBrains-based IDE versus VS Code. This guide covers everything you need to know.

Android Studio is built on IntelliJ IDEA, which means the extension ecosystem is different from VS Code's marketplace. You will not find identical plugins, and the native inline completion behaviour works differently. But the core principle remains the same: Gemma 4 running locally through Ollama can serve as a free, private, offline-capable AI coding assistant for your Kotlin, Flutter, and Gradle workflows — all without sending a single line of code to an external server.

I have tested these setups across native Kotlin Android projects, Flutter apps built in Android Studio, and complex multi-module Gradle builds. This guide walks through the practical steps, shows real workflow examples for each technology, and gives you an honest assessment of where local AI shines and where it falls short in the Android development ecosystem.

Prerequisites

Before you begin, make sure you have the following in place:

  1. Ollama installed and running. Download it from ollama.com and verify it is active by opening any terminal and running ollama list. You should see your installed models. If you have not set up Ollama before, start with my beginner's guide to running Gemma 4 locally.
  2. Gemma 4 pulled. Run ollama pull gemma4 for the default 12B model, or ollama pull gemma4:27b for the larger variant. The 12B download is roughly 7GB; the 27B is around 16GB.
  3. Android Studio Ladybug (2024.2) or later. Earlier versions work, but Ladybug introduced improvements to the integrated terminal and plugin compatibility that make this workflow smoother. You can check your version under Help > About.
  4. Adequate hardware. The same guidance from the VS Code guide applies here, but Android Studio itself is more memory-hungry than VS Code. For a comfortable experience running Android Studio, an emulator, and Gemma 4 simultaneously, I recommend at least 32GB of system RAM and a GPU with 8GB+ VRAM. If you are running on 16GB of RAM, close the emulator when using Gemma 4 for chat tasks, or use a physical device for testing.

One additional note for Android developers: if you are already using Gemini Code Assist (Google's cloud-based AI built into Android Studio), Ollama and Gemma 4 complement it rather than replace it. I will cover how the two compare later in this article.

Connecting Ollama to Android Studio

Unlike VS Code, Android Studio does not have a one-click Ollama integration out of the box. The connection happens through two channels: the integrated terminal and JetBrains-compatible plugins.

Verifying Ollama Is Running

Open Android Studio's integrated terminal (View > Tool Windows > Terminal, or press Alt+F12). Run the following commands to confirm Ollama is accessible:

# Check Ollama is running
ollama list

# Verify the API is responding
curl http://localhost:11434

You should see "Ollama is running" from the curl command, and your list of pulled models from ollama list. If Ollama is not responding, start the Ollama application from your system tray (Windows/Linux) or Applications folder (macOS). On Linux, you may need to run ollama serve in a separate terminal first.

Testing a Quick Query

Before setting up any plugins, verify that Gemma 4 responds correctly from within Android Studio's terminal:

curl http://localhost:11434/api/generate -d '{
  "model": "gemma4",
  "prompt": "Write a Kotlin data class for a User with id, name, and email fields.",
  "stream": false
}'

If you get a JSON response containing Kotlin code, your local AI pipeline is working. Everything from here is about making this connection more convenient through plugins and workflow patterns.

Method 1: Continue Plugin in Android Studio

Continue is the best option for integrating a local LLM into Android Studio. It is the same open-source tool I recommended for VS Code, and it has a JetBrains version available on the JetBrains Marketplace.

Installation

  1. In Android Studio, go to File > Settings > Plugins (or Android Studio > Preferences > Plugins on macOS).
  2. Click the Marketplace tab and search for "Continue".
  3. Install the plugin by Continue.dev and restart Android Studio when prompted.
  4. After restart, you will see a Continue icon in the right sidebar. Click it to open the panel.

Configuring for Ollama and Gemma 4

Continue stores its configuration in ~/.continue/config.json. Open this file (Continue's settings panel has a link to it) and configure it for Gemma 4:

{
  "models": [
    {
      "title": "Gemma 4 27B",
      "provider": "ollama",
      "model": "gemma4:27b"
    }
  ],
  "tabAutocompleteModel": {
    "title": "Gemma 4 12B",
    "provider": "ollama",
    "model": "gemma4:12b"
  },
  "tabAutocompleteOptions": {
    "debounceDelay": 600,
    "multilineCompletions": "always"
  }
}

I set the debounceDelay slightly higher for Android Studio (600ms versus 500ms in VS Code) because the IDE already consumes more memory, and you want to avoid overwhelming the GPU with autocomplete requests while Android Studio's indexer or Gradle sync is running in the background.

Using Continue in Android Studio

The core workflows are identical to the VS Code version:

  • Chat: Select code in the editor, press Ctrl+L (Cmd+L on Mac), and ask Gemma 4 to explain, refactor, or generate tests for it.
  • Inline editing: Select code, press Ctrl+I (Cmd+I on Mac), describe the change you want, and review the generated diff.
  • Tab autocomplete: Continue offers tab-completion suggestions as you type. Accept them with Tab or dismiss with Escape.

One important difference from VS Code: JetBrains IDEs have their own built-in autocomplete system that competes with Continue's suggestions. If you find the two systems conflicting, go to Settings > Editor > General > Code Completion and adjust the settings so that Continue's AI suggestions do not clash with Android Studio's native completions. In practice, I find they coexist well — Android Studio's completions handle API method names and imports, whilst Continue handles larger multi-line suggestions.

Method 2: Terminal-Based Workflows

Not every interaction needs a plugin. Android Studio's integrated terminal gives you direct access to Ollama, and for many tasks this is faster than switching to a chat panel.

Quick Queries with ollama run

Open the terminal in Android Studio (Alt+F12) and run interactive queries directly:

ollama run gemma4 "Explain what viewModelScope.launch does in Kotlin coroutines and when to use it versus lifecycleScope."

For multi-line prompts or when you want to paste code, use the interactive mode:

ollama run gemma4

This opens an interactive session where you can type or paste code blocks and have a back-and-forth conversation. Type /bye to exit. This approach is particularly useful for quick "how do I do X" questions that do not warrant opening a full chat panel.

Piping Files for Review

You can pipe entire files to Gemma 4 for review directly from the terminal. This is powerful for code review workflows:

cat app/src/main/java/com/example/UserRepository.kt | ollama run gemma4 "Review this Kotlin file for potential issues, anti-patterns, and improvements. Focus on coroutine usage and error handling."

On Windows, use type instead of cat, or use Git Bash (which Android Studio can be configured to use as its default terminal).

Kotlin Workflows with Gemma 4

Kotlin is where Android developers spend most of their time, and Gemma 4 handles it well. Here are the workflows I use most frequently.

Generating Data Classes

Prompt Gemma 4 (via Continue chat or terminal) with your API response structure, and it generates the corresponding Kotlin data classes with serialisation annotations:

// Prompt: "Generate Kotlin data classes for this JSON API response.
// Include @Serializable annotations for kotlinx.serialization.
// The response has a list of products, each with id (Int), name (String),
// price (Double), category (nested object with id and name), and
// tags (list of strings)."

@Serializable
data class ProductResponse(
    val products: List<Product>
)

@Serializable
data class Product(
    val id: Int,
    val name: String,
    val price: Double,
    val category: Category,
    val tags: List<String>
)

@Serializable
data class Category(
    val id: Int,
    val name: String
)

Gemma 4 correctly uses @Serializable from kotlinx.serialization rather than older Gson annotations, which shows it has been trained on recent Kotlin conventions. It also handles nullable fields well when you specify them in the prompt.

Jetpack Compose Components

Compose UI generation is one of Gemma 4's stronger areas for Kotlin. Describe a component and it produces idiomatic Compose code:

// Prompt: "Create a Jetpack Compose card component for displaying a user profile.
// Include avatar image (circular), name, role, and an action button.
// Use Material 3 styling."

@Composable
fun UserProfileCard(
    name: String,
    role: String,
    avatarUrl: String,
    onActionClick: () -> Unit,
    modifier: Modifier = Modifier
) {
    Card(
        modifier = modifier.fillMaxWidth(),
        shape = RoundedCornerShape(16.dp),
        colors = CardDefaults.cardColors(
            containerColor = MaterialTheme.colorScheme.surface
        )
    ) {
        Row(
            modifier = Modifier.padding(16.dp),
            verticalAlignment = Alignment.CenterVertically
        ) {
            AsyncImage(
                model = avatarUrl,
                contentDescription = "$name avatar",
                modifier = Modifier
                    .size(56.dp)
                    .clip(CircleShape),
                contentScale = ContentScale.Crop
            )
            Spacer(modifier = Modifier.width(16.dp))
            Column(modifier = Modifier.weight(1f)) {
                Text(
                    text = name,
                    style = MaterialTheme.typography.titleMedium
                )
                Text(
                    text = role,
                    style = MaterialTheme.typography.bodyMedium,
                    color = MaterialTheme.colorScheme.onSurfaceVariant
                )
            }
            FilledTonalButton(onClick = onActionClick) {
                Text("View")
            }
        }
    }
}

The output is clean and follows Material 3 conventions. It correctly uses AsyncImage from Coil (the standard image loading library for Compose), applies proper modifiers, and structures the layout idiomatically. You will typically need to add the import statements yourself, but the generated code compiles without modification once imports are in place.

Ktor Networking Code

For networking with Ktor (increasingly popular as a Kotlin-native alternative to Retrofit), Gemma 4 generates solid boilerplate:

// Prompt: "Write a Ktor HTTP client setup for Android with JSON serialisation,
// logging, and a function to fetch a list of products from /api/products."

Gemma 4 produces a complete HttpClient configuration with the ContentNegotiation plugin, kotlinx.serialization, and the Logging plugin. It handles suspending functions correctly and wraps the call in appropriate error handling. The 27B model is particularly good here — it remembers to include the Android engine dependency and sets reasonable timeout values.
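To give a concrete picture, here is a minimal sketch of the kind of client configuration that prompt produces. The endpoint path and the `Product` type are carried over from the prompt and the data-class section above; the timeout values are illustrative, and the block assumes the `ktor-client-android`, `ktor-client-content-negotiation`, `ktor-client-logging`, and `ktor-serialization-kotlinx-json` dependencies are on the classpath:

```kotlin
import io.ktor.client.*
import io.ktor.client.call.*
import io.ktor.client.engine.android.*
import io.ktor.client.plugins.*
import io.ktor.client.plugins.contentnegotiation.*
import io.ktor.client.plugins.logging.*
import io.ktor.client.request.*
import io.ktor.serialization.kotlinx.json.*
import kotlinx.serialization.json.Json

// Shared client: JSON parsing, request logging, and sane timeouts.
val apiClient = HttpClient(Android) {
    install(ContentNegotiation) {
        json(Json { ignoreUnknownKeys = true })
    }
    install(Logging) { level = LogLevel.INFO }
    install(HttpTimeout) {
        requestTimeoutMillis = 15_000
        connectTimeoutMillis = 10_000
    }
}

// Suspending fetch; the caller decides how failures are surfaced.
suspend fun fetchProducts(baseUrl: String): List<Product> =
    apiClient.get("$baseUrl/api/products").body()
```

`ignoreUnknownKeys = true` is worth keeping even if the model omits it — it stops the app from crashing when the backend adds a field your data classes do not know about.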

Room Database Entities

Ask Gemma 4 to generate Room entity classes, DAOs, and database definitions. It handles the annotation-heavy boilerplate that Room requires and correctly generates @Entity, @PrimaryKey, @Dao, and @Database annotations. I find it especially useful for generating the DAO interface with common query patterns — it produces @Insert(onConflict = OnConflictStrategy.REPLACE), @Query with Flow return types, and @Transaction annotations where appropriate.
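As a sketch of the boilerplate in question — entity and table names here are illustrative, and the block assumes the `androidx.room` runtime and compiler (via KSP) are set up in your module:

```kotlin
import androidx.room.*
import kotlinx.coroutines.flow.Flow

// Table definition: one row per user.
@Entity(tableName = "users")
data class UserEntity(
    @PrimaryKey val id: Int,
    val name: String,
    val email: String
)

@Dao
interface UserDao {
    // Flow re-emits automatically whenever the table changes.
    @Query("SELECT * FROM users ORDER BY name")
    fun observeUsers(): Flow<List<UserEntity>>

    @Insert(onConflict = OnConflictStrategy.REPLACE)
    suspend fun upsert(users: List<UserEntity>)

    @Query("DELETE FROM users")
    suspend fun clear()
}

@Database(entities = [UserEntity::class], version = 1)
abstract class AppDatabase : RoomDatabase() {
    abstract fun userDao(): UserDao
}
```

The Flow-returning query is the pattern worth asking for explicitly: it turns the database into a reactive source you can collect from a ViewModel without any manual refresh logic.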

Flutter and Dart Workflows

Many Android developers use Android Studio for Flutter development as well. Gemma 4 supports Dart, though its Dart knowledge is not as deep as its Kotlin coverage. Here is what works well.

Widget Generation

Describe a widget and Gemma 4 generates the corresponding StatelessWidget or StatefulWidget with proper build methods:

// Prompt: "Create a Flutter StatefulWidget for a search bar with debounced
// text input that calls an onSearch callback after 500ms of inactivity."

class DebouncedSearchBar extends StatefulWidget {
  final ValueChanged<String> onSearch;
  final String hintText;

  const DebouncedSearchBar({
    super.key,
    required this.onSearch,
    this.hintText = 'Search...',
  });

  @override
  State<DebouncedSearchBar> createState() => _DebouncedSearchBarState();
}

class _DebouncedSearchBarState extends State<DebouncedSearchBar> {
  final _controller = TextEditingController();
  Timer? _debounceTimer;

  @override
  void dispose() {
    _debounceTimer?.cancel();
    _controller.dispose();
    super.dispose();
  }

  void _onChanged(String value) {
    _debounceTimer?.cancel();
    _debounceTimer = Timer(
      const Duration(milliseconds: 500),
      () => widget.onSearch(value),
    );
  }

  @override
  Widget build(BuildContext context) {
    return TextField(
      controller: _controller,
      onChanged: _onChanged,
      decoration: InputDecoration(
        hintText: widget.hintText,
        prefixIcon: const Icon(Icons.search),
        border: OutlineInputBorder(
          borderRadius: BorderRadius.circular(12),
        ),
      ),
    );
  }
}

The generated code properly handles Timer disposal, uses super.key (the modern Dart 3 syntax), and follows Flutter naming conventions. For straightforward widgets, Gemma 4 is reliable.

State Management

Gemma 4 can generate Riverpod providers, Bloc classes, or plain ChangeNotifier implementations depending on what you ask for. When prompted for Riverpod code, the 27B model produces correct @riverpod annotations and generated code patterns. The 12B model occasionally confuses older Riverpod 1.x syntax with the newer code-generation approach, so use the larger model for state management tasks.

Package Setup and Configuration

Ask Gemma 4 to generate pubspec.yaml dependency blocks or platform-specific configuration files. For example: "What dependencies do I need in pubspec.yaml for a Flutter app with Riverpod state management, GoRouter navigation, Dio networking, and Hive local storage?" It produces the correct package names and reasonable version constraints, though you should always verify version numbers against pub.dev since training data has a cut-off date.
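For that example prompt, the dependency block it returns looks roughly like this. The package names are the real pub.dev identifiers, but the version constraints below are illustrative placeholders — verify them against pub.dev before copying:

```yaml
dependencies:
  flutter:
    sdk: flutter
  flutter_riverpod: ^2.5.0   # Riverpod state management
  go_router: ^14.0.0         # declarative navigation
  dio: ^5.4.0                # HTTP networking
  hive: ^2.2.3               # local key-value storage
  hive_flutter: ^1.1.0       # Flutter bindings for Hive

dev_dependencies:
  build_runner: ^2.4.0       # required by code-generating packages
```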

Gradle Help: Build Configuration and Dependency Management

This is the section that might surprise you. Gradle build configuration is one of the areas where Gemma 4 provides the most practical value in an Android Studio workflow, because Gradle files are notoriously difficult to write from memory and the official documentation can be dense.

build.gradle.kts Generation

Ask Gemma 4 to generate module-level build.gradle.kts files with specific requirements:

// Prompt: "Generate a build.gradle.kts for an Android app module using
// Kotlin 2.0, Compose, Hilt for DI, Room for database, and Ktor for networking.
// Use version catalogs for dependency management."

plugins {
    alias(libs.plugins.android.application)
    alias(libs.plugins.kotlin.android)
    alias(libs.plugins.kotlin.compose)
    alias(libs.plugins.hilt.android)
    alias(libs.plugins.ksp)
}

android {
    namespace = "com.example.myapp"
    compileSdk = 35

    defaultConfig {
        applicationId = "com.example.myapp"
        minSdk = 26
        targetSdk = 35
        versionCode = 1
        versionName = "1.0.0"
    }

    buildFeatures {
        compose = true
    }

    compileOptions {
        sourceCompatibility = JavaVersion.VERSION_17
        targetCompatibility = JavaVersion.VERSION_17
    }

    kotlinOptions {
        jvmTarget = "17"
    }
}

dependencies {
    // Compose
    implementation(platform(libs.compose.bom))
    implementation(libs.compose.ui)
    implementation(libs.compose.material3)
    implementation(libs.compose.ui.tooling.preview)
    debugImplementation(libs.compose.ui.tooling)

    // Hilt
    implementation(libs.hilt.android)
    ksp(libs.hilt.compiler)
    implementation(libs.hilt.navigation.compose)

    // Room
    implementation(libs.room.runtime)
    implementation(libs.room.ktx)
    ksp(libs.room.compiler)

    // Ktor
    implementation(libs.ktor.client.android)
    implementation(libs.ktor.client.content.negotiation)
    implementation(libs.ktor.serialization.kotlinx.json)
}

Gemma 4 correctly uses KSP instead of KAPT (the modern annotation processing approach), references version catalog aliases rather than hard-coded version strings, and applies the Compose compiler plugin through the Kotlin Compose plugin rather than the older composeOptions block. These are the kinds of details that save you fifteen minutes of searching the migration guides.

Version Catalog Setup

Ask Gemma 4 to generate a complete libs.versions.toml file for your project's dependencies. Provide the libraries you are using and it produces the [versions], [libraries], and [plugins] sections with correct group IDs, artifact IDs, and version references. This is particularly useful when migrating an older project from hard-coded dependencies to the version catalog system.
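A trimmed sketch of the catalog that pairs with the build.gradle.kts above — the group and plugin IDs are the real ones, but the version numbers are illustrative, so pin them to whatever your project actually resolves:

```toml
[versions]
agp = "8.5.0"
kotlin = "2.0.0"
ksp = "2.0.0-1.0.21"
composeBom = "2024.06.00"
room = "2.6.1"

[libraries]
compose-bom = { group = "androidx.compose", name = "compose-bom", version.ref = "composeBom" }
compose-ui = { group = "androidx.compose.ui", name = "ui" }
compose-material3 = { group = "androidx.compose.material3", name = "material3" }
room-runtime = { group = "androidx.room", name = "room-runtime", version.ref = "room" }
room-compiler = { group = "androidx.room", name = "room-compiler", version.ref = "room" }

[plugins]
android-application = { id = "com.android.application", version.ref = "agp" }
kotlin-android = { id = "org.jetbrains.kotlin.android", version.ref = "kotlin" }
kotlin-compose = { id = "org.jetbrains.kotlin.plugin.compose", version.ref = "kotlin" }
ksp = { id = "com.google.devtools.ksp", version.ref = "ksp" }
```

Note that libraries resolved through the Compose BOM deliberately omit `version.ref` — the BOM supplies the versions.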

Multi-Module Configuration

For larger projects with multiple modules, Gemma 4 can generate settings.gradle.kts files with proper module includes, convention plugins for shared build logic, and module-level build.gradle.kts files that reference a shared build-logic module. Describe your module structure — for example, "I have :app, :core:data, :core:domain, :core:ui, :feature:home, and :feature:profile" — and it generates the appropriate configuration for each.
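For that example module structure, the settings file it generates looks roughly like this (the project name and repository choices are assumptions):

```kotlin
// settings.gradle.kts
pluginManagement {
    includeBuild("build-logic") // shared convention plugins live here
    repositories {
        google()
        mavenCentral()
        gradlePluginPortal()
    }
}

dependencyResolutionManagement {
    // Fail fast if a module declares its own repositories.
    repositoriesMode.set(RepositoriesMode.FAIL_ON_PROJECT_REPOS)
    repositories {
        google()
        mavenCentral()
    }
}

rootProject.name = "myapp"

include(":app")
include(":core:data")
include(":core:domain")
include(":core:ui")
include(":feature:home")
include(":feature:profile")
```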

Dependency Resolution Problems

Paste a Gradle sync error into the Continue chat or the terminal, and Gemma 4 is remarkably good at diagnosing the issue. Common problems like version conflicts, missing repositories, incorrect plugin application order, and JVM target mismatches are identified quickly with clear fix instructions. This alone makes Ollama worth setting up — Gradle errors are one of the most time-consuming parts of Android development, and having an AI that can parse the error output and suggest a fix is genuinely useful.
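One way to wire this into the terminal workflow from earlier is to capture the build failure and pipe it straight to the model — the task name here is just an example, and `tail` keeps the prompt focused on the actual error rather than the whole build log:

```
./gradlew assembleDebug 2>&1 | tail -n 40 | \
  ollama run gemma4 "This Gradle build failed. Diagnose the error and suggest a fix."
```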

Performance Tips: Optimising Gemma 4 for Android Studio

Android Studio is a resource-intensive application. Running Gemma 4 alongside it requires some thought about resource allocation.

Model Size Selection

| Model | Min VRAM | Response Speed | Code Quality | Best For |
|---|---|---|---|---|
| Gemma 4 4B | 4GB | Very fast (100-300ms) | Basic completions | Low-end hardware, quick snippets |
| Gemma 4 12B | 8GB | Fast (300-800ms) | Good for most tasks | Best balance for daily use |
| Gemma 4 27B | 16GB | Moderate (800ms-2s) | Excellent for complex logic | Chat, Gradle debugging, architecture |

For Android Studio specifically, I recommend starting with the 12B model for autocomplete and switching to the 27B model only for chat-based tasks like Gradle debugging or architecture discussions. If you are running Android Studio with an emulator simultaneously, the 4B model might be your only realistic option unless you have 32GB+ of RAM and a high-end GPU.

GPU Offloading

Verify that Ollama is utilising your GPU by running ollama ps in the terminal. The PROCESSOR column should read 100% GPU (or a GPU/CPU split). If it shows CPU only, check that your GPU drivers are up to date and review the Ollama server logs, which report at startup whether a compatible GPU was detected. On NVIDIA systems, ensure CUDA is properly installed. On Apple Silicon, GPU acceleration works automatically through Metal.

Context Window Tuning

Reduce the context window for autocomplete tasks to 2048 tokens. The default context window is larger, which consumes more VRAM and slows responses. For chat tasks where you are pasting longer code blocks, you can leave the context at the default size. In Continue's config, you can set this per model:

{
  "models": [
    {
      "title": "Gemma 4 27B",
      "provider": "ollama",
      "model": "gemma4:27b",
      "contextLength": 8192
    }
  ],
  "tabAutocompleteModel": {
    "title": "Gemma 4 12B",
    "provider": "ollama",
    "model": "gemma4:12b",
    "contextLength": 2048
  }
}

Gemma 4 vs Gemini Code Assist vs GitHub Copilot in Android Studio

Android Studio developers have three main AI assistant options. Here is how they compare.

| Feature | Gemma 4 + Ollama (Local) | Gemini Code Assist | GitHub Copilot |
|---|---|---|---|
| Cost | Free | Free tier available; paid plans for teams | $19/month individual |
| Privacy | Fully local — code never leaves your machine | Cloud-based — code sent to Google servers | Cloud-based — code sent to GitHub servers |
| Offline availability | Yes | No | No |
| Native Android Studio integration | Via Continue plugin | Built-in (first-party) | Via JetBrains plugin |
| Inline completions | Via Continue (not native) | Native inline suggestions | Native inline suggestions |
| Kotlin code quality | Good (27B model) | Excellent (trained on Android codebase) | Very good |
| Gradle understanding | Good — handles common patterns | Very good — deep Android build system knowledge | Good |
| Flutter/Dart support | Adequate | Good | Good |
| Project context awareness | Limited to active file + prompt | Indexes project files | Indexes open files |
| Setup effort | Moderate (Ollama + model + plugin) | Minimal (sign in with Google) | Minimal (install plugin, sign in) |
| Response speed | Hardware-dependent | Consistently fast | Consistently fast |

My recommendation: If you have no restrictions on cloud usage and budget is not a concern, Gemini Code Assist is the strongest choice for Android Studio because it is built by Google specifically for the Android ecosystem. If you need privacy, offline capability, or simply want a free option, Gemma 4 through Ollama is the best local alternative. Copilot sits in between — excellent for polyglot developers who work across multiple IDEs and languages, but it lacks the Android-specific depth that Gemini Code Assist offers.

For a deeper comparison of Gemma 4 against other AI models across all categories, see my full AI model comparison for 2026.

Limitations You Should Know About

Honesty about limitations helps you set realistic expectations. Here is where the local Gemma 4 experience in Android Studio falls short.

  • No native inline completions in JetBrains. Unlike VS Code where Continue's inline completions feel nearly native, the JetBrains integration is less seamless. Completions appear as suggestions but can occasionally conflict with Android Studio's own autocomplete. This is a JetBrains platform limitation, not a Gemma 4 issue — it affects all third-party AI completion providers.
  • Slower than cloud-based tools. Gemini Code Assist and Copilot run on powerful server hardware and deliver responses in under 300ms consistently. Gemma 4 on consumer hardware ranges from 300ms (4B model) to 2+ seconds (27B model). For inline completions, this delay is noticeable. For chat tasks, it is acceptable.
  • Context limitations. Gemma 4 does not index your entire project. It works with the code you explicitly provide in the prompt or that your extension sends from the active file. This means it might suggest importing a library you are not using or miss project-specific patterns. Cloud tools that index your workspace handle this better.
  • Resource competition with Android Studio. Android Studio and Gradle already consume significant CPU and RAM. Adding Gemma 4 to the mix can cause slowdowns during Gradle sync or when running the emulator. On machines with less than 32GB RAM, you will feel this.
  • Dart knowledge gaps. While Gemma 4 handles Kotlin well, its Dart and Flutter knowledge is less comprehensive. It occasionally generates deprecated widget patterns or uses older package versions. Always verify Flutter code against the latest documentation.

Frequently Asked Questions

Can I use Gemma 4 in Android Studio for Flutter development specifically?

Yes. The setup is identical — install Ollama, pull Gemma 4, and configure Continue in Android Studio. Gemma 4 generates Dart code and Flutter widgets, though its Flutter knowledge is not as deep as its Kotlin knowledge. For widget generation, state management boilerplate, and basic Dart functions, it works well. For Flutter-specific platform channel code or complex animations, you may need to provide more context in your prompts or use the 27B model for better results. If Flutter is your primary focus, I cover additional Flutter-specific AI workflows in my guide to building Flutter apps with AI.

Does Gemma 4 work with Android Studio on Apple Silicon Macs?

Yes, and it works well. Ollama uses Metal for GPU acceleration on Apple Silicon, which means the M1 Pro, M2, M3, and M4 series chips all handle Gemma 4 efficiently. A Mac with 16GB unified memory runs the 12B model comfortably alongside Android Studio. With 32GB or 64GB, you can run the 27B model and still have headroom for the Android emulator. The unified memory architecture on Apple Silicon is actually an advantage here — there is no separate GPU VRAM to worry about, as the model shares the same memory pool as the rest of the system.

How does this compare to just using Gemini in Android Studio?

Gemini Code Assist (the cloud-based tool built into Android Studio) offers a smoother, more integrated experience because it is a first-party Google product designed specifically for Android development. It has better project context awareness, faster response times, and deeper understanding of Android-specific APIs and patterns. Gemma 4 through Ollama wins on privacy (fully local), cost (completely free with no usage limits), and offline availability. If your company prohibits sending code to external servers, or if you frequently work without internet access, Gemma 4 is the better choice. For everyone else, Gemini Code Assist's free tier is worth trying first, and you can add Gemma 4 as a supplement for sensitive projects or offline scenarios.

Want to use AI tools more effectively?

My courses cover practical AI workflows, from spreadsheet formulas to app development, with real projects and honest tool comparisons.

Browse all courses