“How to set up an LLM locally on your laptop” at Data Science Speakers Club

On the 30th of May I presented the talk “How to set up an LLM locally on your laptop” at the Data Science Speakers Club. The presentation covered the steps to set up Gemma, a recent model from Google that can run on desktop hardware while still delivering strong performance for its size. I ran these steps on my two-year-old laptop with 16GB of RAM, an Intel i7 processor and no GPU!

The steps are as follows:

  1. First of all, download Python; the most convenient way is through Anaconda
  2. Go to Hugging Face, create an account and generate an access token; this will allow you to download the LLM
  3. Install PyTorch, which provides the tensor operations that the transformers library runs on. Make sure you select the correct options for your operating system and for whether or not you have a GPU card. In my case I didn’t have one, so I selected the CPU option
  4. Now head back to Anaconda and Python, install the Transformers library (either through pip or conda), and then use your access token from Hugging Face so you are able to download the LLM
  5. Download the gemma-2b-it model. This is the reduced, instruction-tuned version of Gemma that works in a similar way to a ChatGPT-style prompt. There is also a 7B-parameter version, but only try that one if you have a beefy computer (more than 16GB of RAM, a GPU, or an M3 processor)
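The steps above boil down to just a few lines of Python. Here is a minimal sketch, assuming you have installed `transformers` and `torch` and that your Hugging Face token has been granted access to the gated Gemma model (the prompt text is just an example):

```python
# Minimal sketch: load Gemma locally with Hugging Face Transformers.
# Assumes `pip install transformers torch` and a Hugging Face token with
# access to the gated model (e.g. via `huggingface-cli login`).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2b-it"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)  # CPU by default

prompt = "Explain in one paragraph what a large language model is."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_length=200)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The first run downloads the model weights (a few GB), and on a CPU-only laptop generation takes a while, so be patient.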

With this you are all set to try the LLM. Make sure to adjust max_length, which controls the output size. I generally set it between 200 and 500, depending on how long I expect the outputs to be.
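As a sketch of that knob, here is the same gemma-2b-it setup generating with two different budgets (the prompt is just an example):

```python
# Sketch: max_length caps the TOTAL number of tokens, prompt included,
# so larger values leave more room for the answer at the cost of CPU time.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b-it")
model = AutoModelForCausalLM.from_pretrained("google/gemma-2b-it")

inputs = tokenizer("Write a short poem about laptops.", return_tensors="pt")

concise = model.generate(**inputs, max_length=200)  # short answers
longer = model.generate(**inputs, max_length=500)   # room to elaborate
print(tokenizer.decode(longer[0], skip_special_tokens=True))
```

Since max_length counts the prompt tokens too, `max_new_tokens` is an alternative parameter if you prefer to bound only the generated text.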

It is very useful for general knowledge questions and also for general coding questions! And with this setup there is no more prompting through an API!

Below you can find the slides with the steps, code and examples I used in the presentation.

For more details about the Data Science Speakers Club, and to attend future meetings, see here

Cheers,

Eduardo
