Xreal's futuristic AR glasses are finally available at Amazon and Best Buy
Where to buy the Xreal One Pro AR Glasses...

WWW.LIVESCIENCE.COM
James Webb Space Telescope uncovers 300 mysteriously luminous objects. Are they galaxies or something else?
Deep-field images from NASA's James Webb Space Telescope revealed 300 unusually energetic early galaxy candidates, offering new insights into how the universe formed and evolved over 13 billion years ago.

WWW.LIVESCIENCE.COM
New Pluto mission could uncover dwarf planet's hidden ocean if the 'queen of the underworld' gets to fly
A conceptual mission known as "Persephone" could explore Pluto and its moons for 50 years if it ever gets funded and approved.

WWW.THEKITCHN.COM
The Sparkling Coconut Water I'm Completely Obsessed With Right Now (So Much Better Than Seltzer!)
You haven't had anything like this before.

WWW.BGR.COM
AppleCare+ Vs AppleCare One: Use This Website To Pick The Right Coverage
Apple recently introduced a new AppleCare One bundle that could save you money on coverage, but this online tool will help you decide if it's worth bundling.

BLOG.JETBRAINS.COM
Fine-Tuning and Deploying GPT Models Using Hugging Face Transformers

Hugging Face is currently a household name for machine learning researchers and enthusiasts. One of their biggest successes is Transformers, a model-definition framework for machine learning models in text, computer vision, audio, and video. Because of the vast repository of state-of-the-art machine learning models available on the Hugging Face Hub and the compatibility of Transformers with the majority of training frameworks, it is widely used for inference and model training.

Why do we want to fine-tune an AI model?

Fine-tuning AI models is crucial for tailoring their performance to specific tasks and datasets, enabling them to achieve higher accuracy and efficiency than a general-purpose model. By adapting a pre-trained model, fine-tuning reduces the need for training from scratch, saving time and resources. It also allows for better handling of specific formats, nuances, and edge cases within a particular domain, leading to more reliable and tailored outputs.

In this blog post, we will fine-tune a GPT model with mathematical reasoning so it better handles math questions.

Using models from Hugging Face

When using PyCharm, we can easily browse and add any models from Hugging Face. In a new Python file, from the Code menu at the top, select Insert HF Model.

In the menu that opens, you can browse models by category or start typing in the search bar at the top. When you select a model, you can see its description on the right. When you click Use Model, a code snippet is added to your file. And that's it! You're ready to start using your Hugging Face model.

GPT (Generative Pre-Trained Transformer) models

GPT models are very popular on the Hugging Face Hub, but what are they? GPTs are trained models that understand natural language and generate high-quality text. They are mainly used in tasks related to textual entailment, question answering, semantic similarity, and document classification. The most famous example is ChatGPT, created by OpenAI.

A lot of OpenAI GPT models are available on the Hugging Face Hub, and we will learn how to use these models with Transformers, fine-tune them with our own data, and deploy them in an application.

Benefits of using Transformers

Transformers, together with other tools provided by Hugging Face, provides high-level tools for fine-tuning any sophisticated deep learning model. Instead of requiring you to fully understand a given model's architecture and tokenization method, these tools make models plug and play with any compatible training data, while also providing a large amount of customization in tokenization and training.

Transformers in action

To get a closer look at Transformers in action, let's see how we can use it to interact with a GPT model.

Inference using a pretrained model with a pipeline

After selecting and adding the OpenAI GPT-2 model to the code, this is what we've got:

```python
from transformers import pipeline

pipe = pipeline("text-generation", model="openai-community/gpt2")
```

Before we can use it, we need to make a few preparations. First, we need to install a machine learning framework. In this example, we chose PyTorch. You can install it easily via the Python Packages window in PyCharm.
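If you want to confirm that PyTorch is installed and see which accelerator it can use on your machine, you can run a quick optional check like this (it is not required for the rest of the walkthrough):

```python
import torch

# Print the installed PyTorch version and which accelerators are available.
print(torch.__version__)
print("CUDA available:", torch.cuda.is_available())
print("MPS (Apple Metal) available:", torch.backends.mps.is_available())
```

The result tells you which value to use for the device parameter later on.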
Next, we need to install Transformers using the `torch` option. You can do that using the terminal: open it with the button on the left, or use the ⌥ F12 (macOS) or Alt + F12 (Windows) hotkey. In the terminal, since we are using uv, we use the following commands to add it as a dependency and install it:

```
uv add transformers[torch]
uv sync
```

If you are using pip:

```
pip install transformers[torch]
```

We will also install a couple more libraries that we will need later, including python-dotenv, datasets, notebook, and ipywidgets. You can use either of the methods above to install them.

After that, it may be best to use a GPU device to speed up the model. Depending on what you have on your machine, you can select it by setting the device parameter of the pipeline. Since I am using a Mac M2 machine, I can set device="mps" like this:

```python
pipe = pipeline("text-generation", model="openai-community/gpt2", device="mps")
```

If you have a CUDA GPU, you can set device="cuda" instead.

Now that we've set up our pipeline, let's try it out with a simple prompt:

```python
from transformers import pipeline

pipe = pipeline("text-generation", model="openai-community/gpt2", device="mps")
print(pipe("A rectangle has a perimeter of 20 cm. If the length is 6 cm, what is the width?", max_new_tokens=200))
```

Run the script with the Run button at the top. The result will look something like this:

```
[{'generated_text': 'A rectangle has a perimeter of 20 cm. If the length is 6 cm, what is the width?\n\nA rectangle has a perimeter of 20 cm. If the length is 6 cm, what is the width? A rectangle has a perimeter of 20 cm. If the width is 6 cm, what is the width? A rectangle has a perimeter of 20 cm. If the width is 6 cm, what is the width? A rectangle has a perimeter of 20 cm. If the width is 6 cm, what is the width?\n\nA rectangle has a perimeter of 20 cm. If the width is 6 cm, what is the width? A rectangle has a perimeter of 20 cm. If the width is 6 cm, what is the width? A rectangle has a perimeter of 20 cm. If the width is 6 cm, what is the width? A rectangle has a perimeter of 20 cm. If the width is 6 cm, what is the width?\n\nA rectangle has a perimeter of 20 cm. If the width is 6 cm, what is the width? A rectangle has a perimeter'}]
```

There isn't much reasoning in this at all, only a bunch of nonsense. You may also see this warning:

```
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
```

This is the default setting. You can also set it manually, as below, so the warning disappears, but we don't have to worry about it too much at this stage:

```python
print(pipe("A rectangle has a perimeter of 20 cm. If the length is 6 cm, what is the width?", max_new_tokens=200, pad_token_id=pipe.tokenizer.eos_token_id))
```

Now that we've seen how GPT-2 behaves out of the box, let's see if we can make it better at math reasoning with some fine-tuning.

Load and prepare a dataset from the Hugging Face Hub

Before we work on the GPT model, we first need training data. Let's see how to get a dataset from the Hugging Face Hub.

If you haven't already, sign up for a Hugging Face account and create an access token. We only need a `read` token for now. Store your token in a `.env` file, like so:

```
HF_TOKEN=your-hugging-face-access-token
```

We will use this Math Reasoning Dataset, which contains text describing some mathematical reasoning.
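The dataset-loading code below reads this token with python-dotenv and passes it directly to the loading call. If you prefer to authenticate once for the whole session instead, you can log in programmatically with the huggingface_hub library (installed alongside transformers); this optional sketch assumes the `.env` file described above:

```python
import os

from dotenv import load_dotenv
from huggingface_hub import login

# Read HF_TOKEN from the .env file and authenticate this session with the Hub.
load_dotenv()
login(token=os.getenv("HF_TOKEN"))
```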
We will fine-tune our GPT model with this dataset so it can solve math problems more effectively.

Let's create a new Jupyter notebook, which we'll use for fine-tuning because it lets us run different code snippets one by one and monitor the progress.

In the first cell, we use this script to load the dataset from the Hugging Face Hub:

```python
from datasets import load_dataset
from dotenv import load_dotenv
import os

load_dotenv()
dataset = load_dataset("Cheukting/math-meta-reasoning-cleaned", token=os.getenv("HF_TOKEN"))
dataset
```

Run this cell (it may take a while, depending on your internet speed) to download the dataset. When it's done, we can have a look at the result:

```
DatasetDict({
    train: Dataset({
        features: ['id', 'text', 'token_count'],
        num_rows: 987485
    })
})
```

If you are curious and want to have a peek at the data, you can do so in PyCharm. Open the Jupyter Variables window using the button on the right, expand dataset, and you will see the View as DataFrame option next to dataset['train']. Click on it to take a look at the data in the Data View tool window.

Next, we will tokenize the text in the dataset:

```python
from transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("openai-community/gpt2")
tokenizer.pad_token = tokenizer.eos_token

def tokenize_function(examples):
    return tokenizer(examples['text'], truncation=True, padding='max_length', max_length=512)

tokenized_datasets = dataset.map(tokenize_function, batched=True)
```

Here we use the GPT-2 tokenizer and set the pad_token to the eos_token, the token that marks the end of a sequence. After that, we tokenize the text with a function. It may take a while the first time you run it, but after that it will be cached and faster if you have to run the cell again.

The dataset has almost 1 million rows for training. If you have enough computing power to process all of them, you can use them all. However, in this demonstration we're training locally on a laptop, so I'd better use only a small portion!

```python
tokenized_datasets_split = tokenized_datasets["train"].shard(num_shards=100, index=0).train_test_split(test_size=0.2, shuffle=True)
tokenized_datasets_split
```

Here I take only 1% of the data and then perform train_test_split to split the dataset into two:

```
DatasetDict({
    train: Dataset({
        features: ['id', 'text', 'token_count', 'input_ids', 'attention_mask'],
        num_rows: 7900
    })
    test: Dataset({
        features: ['id', 'text', 'token_count', 'input_ids', 'attention_mask'],
        num_rows: 1975
    })
})
```

Now we are ready to fine-tune the GPT-2 model.

Fine-tune a GPT model

In the next empty cell, we will set our training arguments:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir='./results',
    num_train_epochs=5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    warmup_steps=100,
    weight_decay=0.01,
    save_steps=500,
    logging_steps=100,
    dataloader_pin_memory=False
)
```

Most of them are pretty standard for fine-tuning a model. However, depending on your computer setup, you may want to tweak a few things:

- Batch size: Finding the optimal batch size is important, since the larger the batch size, the faster the training goes. However, there is a limit to how much memory is available on your CPU or GPU, so you may find there's an upper threshold.
- Epochs: More epochs make the training take longer. You can decide how many epochs you need.
- Save steps: Save steps determine how often a checkpoint is saved to disk. If the training is slow and there is a chance that it will stop unexpectedly, you may want to save more often (i.e. set this value lower).
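As a rough sanity check (an optional extra step, not one of the notebook cells above), you can estimate how many optimizer steps the run will take from the split size, batch size, and number of epochs, which helps when choosing warmup_steps, logging_steps, and save_steps:

```python
# Rough estimate of the number of optimizer steps for this run.
num_examples = 7900   # rows in our training split
batch_size = 8        # per_device_train_batch_size
epochs = 5            # num_train_epochs

steps_per_epoch = (num_examples + batch_size - 1) // batch_size  # ceiling division
total_steps = steps_per_epoch * epochs
print(f"{steps_per_epoch} steps per epoch, {total_steps} steps in total")
```

With the settings above, that works out to 988 steps per epoch and 4,940 steps in total, so save_steps=500 produces a checkpoint roughly every half epoch.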
After we've configured our settings, we will put the trainer together in the next cell. Here we also load the base GPT-2 model that we want to fine-tune and pass it to the Trainer:

```python
from transformers import Trainer, DataCollatorForLanguageModeling, GPT2LMHeadModel

# Load the base GPT-2 model that we are going to fine-tune
model = GPT2LMHeadModel.from_pretrained("openai-community/gpt2")

data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_datasets_split['train'],
    eval_dataset=tokenized_datasets_split['test'],
    data_collator=data_collator,
)

trainer.train(resume_from_checkpoint=False)
```

We set `resume_from_checkpoint=False`, but you can set it to `True` to continue from the last checkpoint if the training is interrupted.

After the training finishes, we will evaluate and save the model:

```python
trainer.evaluate(tokenized_datasets_split['test'])
trainer.save_model("./trained_model")
```

We can now use the trained model in the pipeline. Let's switch back to `model.py`, where we used a pipeline with a pretrained model:

```python
from transformers import pipeline

pipe = pipeline("text-generation", model="openai-community/gpt2", device="mps")
print(pipe("A rectangle has a perimeter of 20 cm. If the length is 6 cm, what is the width?", max_new_tokens=200, pad_token_id=pipe.tokenizer.eos_token_id))
```

Now let's change `model="openai-community/gpt2"` to `model="./trained_model"` and see what we get:

```
[{'generated_text': "A rectangle has a perimeter of 20 cm. If the length is 6 cm, what is the width?\nAlright, let me try to solve this problem as a student, and I'll let my thinking naturally fall into the common pitfall as described.\n\n---\n\n**Step 1: Attempting the Problem (falling into the pitfall)**\n\nWe have a rectangle with perimeter 20 cm. The length is 6 cm. We want the width.\n\nFirst, I need to find the area under the rectangle.\n\nLet's set \\( A = 20 - 12 \\), where \\( A \\) is the perimeter.\n\n**Area under a rectangle:** \n\\[\nA = (20-12)^2 + ((-12)^2)^2 = 20^2 + 12^2 = 24\n\\]\n\nSo, \\( 24 = (20-12)^2 = 27 \\).\n\nNow, I'll just divide both sides by 6 to find the area under the rectangle.\n"}]
```

Unfortunately, it still does not solve the problem. However, it did come up with some mathematical formulas and reasoning that it didn't use before. If you want, you can try fine-tuning the model a bit more with the data we didn't use.
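Back in the notebook, one way to track whether additional fine-tuning is actually helping (an optional extra step beyond the evaluation above) is to turn the loss reported by `trainer.evaluate()` into a perplexity score, which is easier to compare between runs:

```python
import math

# trainer.evaluate() returns a dictionary containing the cross-entropy loss
# on the evaluation split; exp(loss) is the model's perplexity.
eval_results = trainer.evaluate(tokenized_datasets_split['test'])
print(f"Eval loss: {eval_results['eval_loss']:.3f}")
print(f"Perplexity: {math.exp(eval_results['eval_loss']):.2f}")
```

A lower perplexity on the test split means the model assigns higher probability to the reference reasoning text.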
In the next section, we will see how we can deploy a fine-tuned model to API endpoints using both the tools provided by Hugging Face and FastAPI.

Deploying a fine-tuned model

The easiest way to deploy a model in a server backend is to use FastAPI. Previously, I wrote a blog post about deploying a machine learning model with FastAPI. While we won't go into the same level of detail here, we will go over how to deploy our fine-tuned model.

With the help of Junie, we've created some scripts, which you can see here. These scripts let us deploy a server backend with FastAPI endpoints.

There are some new dependencies that we need to add:

```
uv add fastapi pydantic uvicorn
uv sync
```

Let's have a look at some interesting points in the scripts, in `main.py`:

```python
# Initialize FastAPI app
app = FastAPI(
    title="Text Generation API",
    description="API for generating text using a fine-tuned model",
    version="1.0.0"
)

# Initialize the model pipeline
try:
    pipe = pipeline("text-generation", model="../trained_model", device="mps")
except Exception as e:
    # Fallback to CPU if MPS is not available
    try:
        pipe = pipeline("text-generation", model="../trained_model", device="cpu")
    except Exception as e:
        print(f"Error loading model: {e}")
        pipe = None
```

After initializing the app, the script tries to load the model into a pipeline. If a Metal GPU is not available, it falls back to using the CPU. If you have a CUDA GPU instead of a Metal GPU, you can change `mps` to `cuda`.

```python
# Request model
class TextGenerationRequest(BaseModel):
    prompt: str
    max_new_tokens: int = 200

# Response model
class TextGenerationResponse(BaseModel):
    generated_text: str
```

Two new classes are created, inheriting from Pydantic's `BaseModel`.

We can also inspect our endpoints with the Endpoints tool window. Click on the globe next to `app = FastAPI` on line 11 and select Show All Endpoints.

We have three endpoints. Since the root endpoint is just a welcome message, we will look at the other two.

```python
@app.post("/generate", response_model=TextGenerationResponse)
async def generate_text(request: TextGenerationRequest):
    """
    Generate text based on the provided prompt.

    Args:
        request: TextGenerationRequest containing the prompt and generation parameters

    Returns:
        TextGenerationResponse with the generated text
    """
    if pipe is None:
        raise HTTPException(status_code=500, detail="Model not loaded properly")

    try:
        result = pipe(
            request.prompt,
            max_new_tokens=request.max_new_tokens,
            pad_token_id=pipe.tokenizer.eos_token_id
        )

        # Extract the generated text from the result
        generated_text = result[0]['generated_text']

        return TextGenerationResponse(generated_text=generated_text)
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Error generating text: {str(e)}")
```

The `/generate` endpoint collects the request prompt and generates the response text with the model.

```python
@app.get("/health")
async def health_check():
    """Check if the API and model are working properly."""
    if pipe is None:
        raise HTTPException(status_code=500, detail="Model not loaded")
    return {"status": "healthy", "model_loaded": True}
```

The `/health` endpoint checks whether the model is loaded correctly. This can be useful if the client-side application needs to check before making the other endpoint available in its UI.

In `run.py`, we use uvicorn to run the server:

```python
import uvicorn

if __name__ == "__main__":
    uvicorn.run("main:app", host="0.0.0.0", port=8000, reload=True)
```

When we run this script, the server starts at http://0.0.0.0:8000/. Once it is running, we can go to http://0.0.0.0:8000/docs to test out the endpoints.

We can try this with the `/generate` endpoint:

```json
{
  "prompt": "5 people give each other a present. How many presents are given altogether?",
  "max_new_tokens": 300
}
```

This is the response we get:

```json
{
  "generated_text": "5 people give each other a present. How many presents are given altogether?\nAlright, let's try to solve the problem:\n\n**Problem** \n1. Each person gives each other a present. How many presents are given altogether?\n2. How many \"gift\" are given altogether?\n\n**Common pitfall** \nAssuming that each present is a \"gift\" without considering the implications of the original condition.\n\n---\n\n### Step 1: Attempting the problem (falling into the pitfall)\n\nOkay, so I have two people giving each other a present, and I want to know how many are present. I remember that there are three types of gifts: gifts, gins, and ginses.\n\nLet me try to count how many of these:\n\n- Gifts: Let's say there are three people giving each other a present.\n- Gins: Let's say there are three people giving each other a present.\n- Ginses: Let's say there are three people giving each other a present.\n\nSo, total gins and ginses would be:\n\n- Gins: \\( 2 \\times 3 = 1 \\), \\( 2 \\times 1 = 2 \\), \\( 1 \\times 1 = 1 \\), \\( 1 \\times 2 = 2 \\), so \\( 2 \\times 3 = 4 \\).\n- Ginses: \\( 2 \\times 3 = 6 \\), \\("
}
```
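You can also call the endpoint from a script instead of the interactive docs page. Here is a minimal sketch using the requests library (an extra dependency, not included in the deployment scripts above), assuming the server is running locally on port 8000:

```python
import requests

# Send a prompt to the /generate endpoint of the locally running server.
payload = {
    "prompt": "5 people give each other a present. How many presents are given altogether?",
    "max_new_tokens": 300,
}
response = requests.post("http://127.0.0.1:8000/generate", json=payload, timeout=120)
response.raise_for_status()
print(response.json()["generated_text"])
```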
Feel free to experiment with other requests.

Conclusion and next steps

Now that you have successfully fine-tuned an LLM like GPT-2 with a math reasoning dataset and deployed it with FastAPI, you can fine-tune many more of the open-source LLMs available on the Hugging Face Hub. You can experiment with fine-tuning other LLMs using either the open-source data there or your own datasets. If you want to (and the license of the original model allows it), you can also upload your fine-tuned model to the Hugging Face Hub. Check out their documentation for how to do that.

One last remark regarding using or fine-tuning models with resources from the Hugging Face Hub: make sure to read the licenses of any model or dataset that you use to understand the conditions for working with those resources. Is commercial use allowed? Do you need to credit the resources used?

In future blog posts, we will keep exploring more code examples involving Python, AI, machine learning, and data visualization.

In my opinion, PyCharm provides best-in-class Python support that ensures both speed and accuracy. Benefit from the smartest code completion, PEP 8 compliance checks, intelligent refactorings, and a variety of inspections to meet all your coding needs. As demonstrated in this blog post, PyCharm provides integration with the Hugging Face Hub, allowing you to browse and use models without leaving the IDE. This makes it suitable for a wide range of AI and LLM fine-tuning projects.

Download PyCharm Now
FR.GAMERSLIVE.FR
Review: SHINOBI: Art of Vengeance, it's wild! A game between tradition and...