Whenever I got a chance to try Google's Gemini model, the results were disappointing. It never seemed to have enough context. Other tools like ChatGPT and GitHub Copilot, however, were able to safely assume a default context and provide relevant answers. Only when I crafted the prompts carefully with the TCREI framework (Tiny Crab Rode Enormous Iguana: Task, Context, Reference, Evaluate, Iterate) did I get a decent response from Gemini. Until last week, ChatGPT (for personal use) and GitHub Copilot (for work) were my go-to AI assistants.
Image generated by gemini.google.com

I wanted to create an Android app that helps me convert images to PDF files. All the free apps available in the Google Play Store are loaded with advertisements. I never had the time to build it completely. My past attempts faded out several times: after brushing up on a few Android libraries, after going through a Udemy course, or just after creating the home screen. The iText library looked simple but definitely has a learning curve. I was not interested in commercial SDKs.
Through ChatGPT, I came to know about the PdfDocument class available in Android itself. I decided to give it a shot and installed Android Studio again. Otter, the latest stable version of Android Studio, has Gemini Pro built in, and the auth process was smooth and easy. I tried a few prompts and the responses were detailed. I asked about Java 25/21 support for Android and whether it would make sense to create the project in Java instead of Kotlin. Having used Agent mode in GitHub Copilot at work, I started with a simple task: add a button to the existing main screen.
Gemini Agent gave clean code. I increased the complexity of the tasks and asked it to change the Compose layout (e.g. Column instead of Box inside the Scaffold), create an image picker, show a preview, create a top bar, convert the list of URIs to a PDF, save the files to a private folder, add a settings screen, use the preferences while creating PDF files, and so on. Gemini Agent handled all of it with great explanations. Thanks to Google for giving individual developers free access to their Pro model, Gemini 2.5. It is definitely worth a try.
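One detail in the "convert the list of URIs to a PDF" step is deciding where each image lands on the page. Android's PdfDocument sizes pages in PostScript points (1/72 inch), so a photo usually has to be scaled to fit. Below is a minimal sketch of that calculation in plain Java. It is not the code from the app: the `PageFit` class, the `fitCentered` method and the A4 constants are hypothetical names chosen for illustration.

```java
import java.util.Arrays;

/** Hypothetical helper (not from the app): fits an image onto a PDF page without distortion. */
final class PageFit {
    // A4 in PostScript points (1/72 inch), rounded to whole points.
    static final int A4_WIDTH_PT = 595;
    static final int A4_HEIGHT_PT = 842;

    /**
     * Returns the destination rectangle {left, top, right, bottom} in page points.
     * The image keeps its aspect ratio and is centered inside the margins.
     * Images smaller than the printable area are scaled up to fill it.
     */
    static float[] fitCentered(int imageWidth, int imageHeight,
                               int pageWidth, int pageHeight, int margin) {
        if (imageWidth <= 0 || imageHeight <= 0) {
            throw new IllegalArgumentException("image dimensions must be positive");
        }
        int availableWidth = pageWidth - 2 * margin;
        int availableHeight = pageHeight - 2 * margin;
        if (availableWidth <= 0 || availableHeight <= 0) {
            throw new IllegalArgumentException("margin leaves no printable area");
        }
        // The smaller of the two ratios guarantees the image fits in both directions.
        float scale = Math.min((float) availableWidth / imageWidth,
                               (float) availableHeight / imageHeight);
        float drawWidth = imageWidth * scale;
        float drawHeight = imageHeight * scale;
        float left = margin + (availableWidth - drawWidth) / 2f;
        float top = margin + (availableHeight - drawHeight) / 2f;
        return new float[] {left, top, left + drawWidth, top + drawHeight};
    }

    public static void main(String[] args) {
        // A 4000x3000 landscape photo on a portrait A4 page with no margin.
        float[] rect = fitCentered(4000, 3000, A4_WIDTH_PT, A4_HEIGHT_PT, 0);
        System.out.println(Arrays.toString(rect));
    }
}
```

On Android, the four values could presumably be wrapped in a `RectF` and passed to `Canvas.drawBitmap(bitmap, null, rect, null)` on the canvas of a page returned by `PdfDocument.startPage`. I have left that part out so the sketch runs without the Android SDK.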
The agent also helped write the GitHub Actions workflow YAML files and a neat ReadMe.md with badges.
Within 4 days, I was able to get the Android app into decent shape and even upload it to the Google Play Store.
Though most of the code was written by Gemini, I am still the proud owner of the few lines I added by hand. The source code is available here: https://github.com/sudhans/Image2Pdf
Gotchas: A few times, the Agent was not able to refactor or write to the same file and kept retrying, so I had to force stop the task. When I made changes manually, the agent did not account for them and rolled them back. For example, I had updated a deprecated method call as per the documentation, and the agent reverted it.
The Agents are here for a revolution. The impact will be felt across all industries, especially IT and how things get done there. This also reminds me of the famous quote from the movie I, Robot:
- Detective Del Spooner: Is there a problem with the Three Laws?
- Dr. Alfred Lanning: The Three Laws are perfect.
- Detective Del Spooner: Then why would you build a robot that could function without them?
- Dr. Alfred Lanning: The Three Laws will lead to only one logical outcome.
- Detective Del Spooner: What? What outcome?
- Dr. Alfred Lanning: Revolution.
- Detective Del Spooner: Whose revolution?
- Dr. Alfred Lanning: *That*, Detective, is the right question. Program terminated.
Note: Cloudflare is down today, which gave me time to write this post 🙂 The outage took down Udemy, KodeKloud, Pluralsight and ChatGPT along with it.