Close-up Cutting Stuff

https://theawesomer.com/photos/2022/01/cutting_stuff_close_up_macro_t.jpg

Object pointed their macro camera lens at ordinary objects like wires and bolts so we could see what they look like when they’re cut in half. The slow-motion and sound effects help to heighten the impact of the visuals.

The Awesomer

Real Python: Build a Django Front End With Bulma – Part 2

In this four-part tutorial series, you’re building a social network with Django that you can showcase in your portfolio. This project will strengthen your understanding of relationships between Django models and show you how to use forms so that users can interact with your app and with each other. You’ll also make your Django front end look good by using the Bulma CSS framework.

In the first part, you extended the Django User model to add profile information that allows users to follow and unfollow each other. You also learned how to customize the Django admin interface and troubleshoot during development with the help of Django’s error messages.

In the second part of this tutorial series, you’ll learn how to:

  • Integrate Bulma CSS and style your app
  • Use template inheritance to reduce repetition
  • Structure Django templates in a folder hierarchy
  • Build routing and view functions
  • Interlink pages of your app using dynamic URLs

After finishing the second part of this project, you’ll move on to the third part of this tutorial series, where you’ll create the back end for adding content to your social network. You’ll also add the missing templates to allow your users to view the text-based content on their dashboard page.

You can download the code that you’ll need to start the second part of this project by clicking the link below and going to the source_code_start/ folder:

Get Source Code: Click here to get the source code for this part of building out your Django social network.

Demo

In this four-part tutorial series, you’re building a small social network that allows users to post short text-based messages. The users of your app can also follow other user profiles to see the posts of these users or unfollow them to stop seeing their text-based posts:

In the second part of this series, you’ll work with templates and learn to use the CSS framework Bulma to give your app a user-friendly appearance. You’ll also tackle common tasks such as setting up routing, views, and templates for individual user profile pages as well as interlinking them with the profile list page:

At the end of this part of the tutorial series, you’ll be able to access detail pages and the profile list page and navigate between them. You’ll also have Bulma added to style the pages.

Project Overview

In this section, you’ll get an overview of the topics that you’ll cover in this second part of the tutorial series. You’ll also get a chance to revisit the full project implementation steps, in case you need to skip back to a previous step from an earlier part of the series or if you want to see what’s still up ahead.

At this point, you should have finished working through part one of this tutorial series. If you did, then you’re ready to continue with your next steps, which focus on templates and front-end styling:

After completing all steps of this second part of the series, you can continue with part three.

To refresh your memory and get an overview of how you’ll work through all four parts of this series on building your Django social network, you can expand the collapsible section below:

You’re implementing the project in a number of steps spread out over multiple separate tutorials in this series. There’s a lot to cover, and you’re going into detail along the way:

✅ Part 1: Models and Relationships

  • Step 1: Set Up the Base Project
  • Step 2: Extend the Django User Model
  • Step 3: Implement a Post-Save Hook

📍 Part 2: Templates and Front-End Styling

  • Step 4: Create a Base Template With Bulma
  • Step 5: List All User Profiles
  • Step 6: Access Individual Profile Pages

⏭ Part 3: Follows and Dweets

  • Step 7: Follow and Unfollow Other Profiles
  • Step 8: Create the Back-End Logic For Dweets
  • Step 9: Display Dweets on the Front End

⏭ Part 4: Forms and Submissions

  • Step 10: Submit Dweets Through a Django Form
  • Step 11: Prevent Double Submissions and Handle Errors
  • Step 12: Improve the Front-End User Experience

Each of these steps will provide links to any necessary resources. By approaching the steps one at a time, you’ll have the opportunity to pause and come back at a later point in case you want to take a break.

With the high-level structure of this tutorial series in mind, you’ve got a good idea of where you’re at and which implementation steps you’ll handle in the later parts.

Before getting started with the next step, take a quick look at the prerequisites to skim any links to other resources that might be helpful along the way.

Prerequisites

To successfully work through this part of your project, you need to have completed the first part on models and relationships and you should confirm that your project is working as described there. It would be best if you’re also comfortable with the following concepts:

Read the full article at https://realpython.com/django-social-front-end-2/ »


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

Planet Python

This is the way

Someone in Palm Beach tells New Yorkers to leave if ‘woke’

Someone had a warning for New Yorkers visiting former President Donald Trump’s new hometown — leave if you are “woke.”

Palm Beach police are investigating after someone placed fliers over the weekend on New York-licensed cars parked on the wealthy island, reading, “If you are one of the those ‘woke’ people — leave Florida. You will be happier elsewhere, as will we.”

Remember that I just recently wrote a post about some fucking guy who had the Progressive audacity to leave San Francisco for Miami, because it is a failed city, and then say that Miami is worse.

This is what Red Southern states have to deal with.

I’m learning all about what New Yorkers are doing to North Carolina.

Telling those fuckers to stay out with flyers is probably the nicest way Southerners have dealt with carpetbaggers.

Real Python: Deploy Your Python Script on the Web With Flask

https://files.realpython.com/media/Python-driven-Web-Applications_Watermarked.34e6451b18f1.jpg

You wrote a Python script that you’re proud of, and now you want to show it off to the world. But how? Most people won’t know what to do with your .py file. Converting your script into a Python web application is a great solution to make your code usable for a broad audience.

In this course, you’ll learn how to go from a local Python script to a fully deployed Flask web application that you can share with the world.

By the end of this course, you’ll know:

  • What web applications are and how you can host them online
  • How to convert a Python script into a Flask web application
  • How to improve user experience by adding HTML to your Python code
  • How to deploy your Python web application to Google App Engine
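
The conversion step in this list can be sketched in a few lines. This is a minimal, hypothetical example (the my_script function is invented as a stand-in for your own code); it assumes Flask is installed:

```python
# Minimal sketch of wrapping an existing script in a Flask app.
# Hypothetical names; assumes Flask is installed (pip install flask).
from flask import Flask

app = Flask(__name__)

def my_script():
    # Stand-in for whatever your original .py file computes.
    return "Hello from my script!"

@app.route("/")
def index():
    # Wrap the script's result in a bit of HTML for the browser.
    return f"<h1>{my_script()}</h1>"

if __name__ == "__main__":
    app.run(debug=True)  # local development server only
```

Run the file and visit http://127.0.0.1:5000/ to see the script's output served as a web page; deployment targets like Google App Engine replace the development server with their own entry point.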

Planet Python

Backup Performance Comparison: mysqldump vs MySQL Shell Utilities vs mydumper vs mysqlpump vs XtraBackup

https://www.percona.com/blog/wp-content/uploads/2021/12/MySQL-Backup-Performance-Comparison.png

In this blog post, we will compare the performance of backing up a MySQL database using mysqldump, the MySQL Shell feature called Instance Dump, mysqlpump, mydumper, and Percona XtraBackup. All of these options are open source and free to use for the entire community.

To start, let’s see the results of the test.

Benchmark Results

The benchmark was run on an m5dn.8xlarge instance, with 128GB RAM, 32 vCPU, and 2xNVMe disks of 600GB (one for backup and the other one for MySQL data). The MySQL version was 8.0.26 and configured with 89Gb of buffer pool, 20Gb of redo log, and a sample database of 177 GB (more details below).

We can observe the results in the chart below:

MySQL Backup Results

And if we analyze the chart only for the multi-threaded options:

multi-threaded options

As we can see, for each tool, I ran each command three times in order to experiment with 16, 32, and 64 threads. The exception is mysqldump, which does not have a parallel option and only runs in single-threaded mode.

We can observe interesting outcomes:

  1. When using zstd compression, mydumper really shines in terms of performance. This option was added relatively recently (MyDumper 0.11.3).
  2. If mydumper uses gzip instead of zstd, MySQL Shell is the fastest backup option.
  3. In third place, we have Percona XtraBackup.
  4. mysqlpump is the fourth fastest, followed closely by mydumper when using gzip.
  5. mysqldump is the classic, old-school way to perform dumps and is the slowest of the tools tested.
  6. On a server with more CPUs, the potential parallelism increases, giving even more advantage to the tools that can benefit from multiple threads.

Hardware and Software Specs

These are the specs of the benchmark:

  • 32 CPUs
  • 128GB Memory
  • 2x NVMe disks 600 GB
  • Centos 7.9
  • MySQL 8.0.26
  • MySQL shell 8.0.26
  • mydumper 0.11.5 – gzip
  • mydumper 0.11.5 – zstd
  • Xtrabackup 8.0.26

The my.cnf configuration:

[mysqld]
innodb_buffer_pool_size = 89G
innodb_log_file_size = 10G

Performance Test

For the test, I used sysbench to populate MySQL. To load the data, I choose the tpcc method:

$ ./tpcc.lua  --mysql-user=sysbench --mysql-password='sysbench' --mysql-db=percona --time=300 --threads=64 --report-interval=1 --tables=10 --scale=100 --db-driver=mysql prepare

Before starting the comparison, I ran mysqldump once and discarded the results to warm up the cache, otherwise our test would be biased because the first backup would have to fetch data from the disk and not the cache.

With everything set, I started the mysqldump with the following options:

$ time mysqldump --all-databases --max-allowed-packet=4294967295 --single-transaction -R --master-data=2 --flush-logs | gzip > /backup/dump.dmp.gz

For the Shell utility:

$ mysqlsh
MySQL JS > shell.connect('root@localhost:3306');
MySQL localhost:3306 ssl test JS > util.dumpInstance("/backup", {ocimds: true, compatibility: ["strip_restricted_grants","ignore_missing_pks"],threads: 16})

For mydumper:

$ time mydumper  --threads=16  --trx-consistency-only --events --routines --triggers --compress --outputdir /backup/ --logfile /backup/log.out --verbose=2

PS: To use zstd, there are no changes in the command line, but you need to download the zstd binaries.

For mysqlpump:

$ time mysqlpump --default-parallelism=16 --all-databases > backup.out

For xtrabackup:

$ time xtrabackup --backup --parallel=16 --compress --compress-threads=16 --datadir=/mysql_data/ --target-dir=/backup/

Analyzing the Results

And what do the results tell us?

Parallel methods have similar performance throughput. The mydumper tool cut the execution time by 50% when using zstd instead of gzip, so the compression method makes a big difference when using mydumper.

For the util.dumpInstance utility, one advantage is that the tool stores data in both binary and text format and uses zstd compression by default. Like mydumper, it uses multiple files to store the data and has a good compression ratio. 

XtraBackup got third place with a few seconds of difference from MySQL shell. The main advantage of XtraBackup is its flexibility, providing PITR and encryption for example. 

Next, mysqlpump is more efficient than mydumper with gzip, but only by a small margin. Both are logical backup methods and work in the same way. I tested mysqlpump with zstd compression, but the results were the same, which is why I didn’t add it to the chart. One possible explanation is that mysqlpump streams the data to a single file.

Lastly, mysqldump has the most predictable behavior, with similar execution times across runs. The lack of parallelism and compression is a disadvantage for mysqldump; however, since it has been present since the earliest MySQL versions, based on Percona support cases, it is still widely used as a logical backup method.

Please leave in the comments below what you thought about this blog post, if I missed something, or if it helped you. I will be glad to discuss it!

Useful Resources

Finally, you can reach us through the social networks, our forum, or access our material using the links presented below:

Planet MySQL

How to avoid achy feet while working at home

https://www.futurity.org/wp/wp-content/uploads/2022/01/feet-covid-19-working-from-home-slippers-1600.jpg

A woman wears white unicorn slippers under her desk.

If you’ve been spending more time at home during the pandemic, you may not realize how important proper footwear is for keeping certain injuries at bay.

Sean Peden, an orthopaedic foot and ankle specialist from Yale University Medicine, says not wearing supportive footwear on a regular basis can lead to foot pain and other problems.

“Many people are continuing to work at home part- or full-time, which for some can mean wearing slippers or walking around barefoot,” Peden says. “And because of that, many patients are coming to us with foot problems.”

Taking good care of your feet will not only help you avoid common injuries like tendonitis and plantar fasciitis, but it can also prevent other issues with your hips, knees, and back from developing, he adds.

Here, Peden shares some of the most common foot problems he sees—and simple treatments to get relief:

Think of shoes as shock absorbers

Just as you would pick out an appropriate shoe for your commute into the office, it’s important to put the same level of thought into selecting an at-home shoe.

Walking barefoot at home is not recommended for the same reason walking barefoot outside is ill-advised, Peden says.

“All kinds of footwear protect your feet. Over the course of weeks or months, the strain of walking barefoot can add significant stress to your arches, tendons, plantar fascia, and joints,” he says. “This can lead to a range of complications, from minor conditions such as calluses to major issues such as arch collapse.”

It may help to think of footwear as shock absorbers and, based on body type and gait, some of us need more shock absorption than others, Peden says.

“If you have sore feet—or have had foot problems in the past—wearing a pair of what I call ‘house shoes,’ or ‘house slippers,’ is a good idea.”

By that, Peden means a hard-soled, slip-on shoe or slipper that is worn exclusively inside the home (ideally) to avoid bringing in dirt or bacteria.

“To be practical, I suggest a slip-on clog or slipper without laces. That way, you don’t have to tie and untie your shoes 10 times a day,” Peden says. “A hard sole is important because the harder the sole, the less stress the joints and tendons in your foot experience with each step. The hard sole transfers that stress to the shoe rather than to the foot.”

In general, avoid fluffy, formless slippers, he advises. “If you are at home, you might go up and down stairs dozens of times a day—or do chores around the house. And those are not activities to do with footwear that doesn’t have any support,” Peden says. “A good rule of thumb is if it isn’t something you could walk in for a few blocks comfortably, you shouldn’t wear it around the house all day, either.”

Painful tendonitis

One of the most common foot problems Peden has seen in patients since the pandemic started is Achilles tendonitis, or inflammation of a tendon (a thick cord of tissue that connects muscles to bones). The Achilles tendon runs from the back of your calf to your heel bone. Achilles tendonitis can cause pain and swelling in the foot and ankle.

An injury, overuse, and flat feet are all causes of Achilles tendonitis, Peden says. “It can be an issue especially if people with flat feet spend six months to a year not wearing supportive shoes on a regular basis,” he says. “The tendon in the arch of the foot becomes inflamed as the foot gets flatter. It is quite painful and can be debilitating.”

Peden says he is also seeing more patients with posterior tibial tendonitis, which causes a collapsed arch or flat foot.

The Fix: For acute pain, the first things to try are rest, ice, and staying off your feet as much as possible. Finding footwear with good arch support is another must, Peden says.

“Some people might need an ankle brace or additional inserts for their shoes, but for the vast majority, proper footwear is the answer. These tendon flares generally last a few months, but patients usually see improvement within a week or two.”

People with tendon issues should get proper treatment, Peden says. “You want to avoid developing a chronic tendon issue, because those are harder to cure.”

Plantar fasciitis: ‘stabbing’ heel pain

Many patients have developed plantar fasciitis, inflammation of the band of tissue on the bottom of your foot.

A common symptom is a stabbing pain in the heel that can be the most intense when you first step out of bed in the morning. That’s because the plantar fascia, which runs from the heel to the base of your toes, tightens overnight.

The plantar fascia supports the arch of the foot and absorbs stress. Too much stress—from standing on your feet on a hard surface for a long time, improper shoes, or running—can cause irritation and tiny tears in the band of tissue.

“The pain is usually on the bottom part of the heel,” Peden says. “It’s associated with tight Achilles tendons and calf muscles. If people spend a lot of their day sitting, for example, the muscles can tighten up, and wearing improper footwear can exacerbate the issue.

“For people who work outside the home and are on their feet all day, including nurses, they should wear a supportive shoe—and not something too soft or flexible. This can include sneakers, a hard clog, or a work shoe, depending on personal preference.”

The Fix: Besides supportive footwear and avoiding walking around barefoot, treatment should include a home stretching program to address the tightness in the calf muscles and Achilles tendons, Peden says.

Another effective treatment is to wear a soft, flexible splint that holds your foot at a 90-degree angle while you are sleeping; this keeps the plantar fascia stretched out. You can also wear a splint while lying on the couch watching TV.

As painful as plantar fasciitis can be, Peden says it is not a progressive condition. “People often worry it’s the start of something like arthritis, which continues to get worse,” he says. “It might take a few months of conservative, noninvasive, nonsurgical treatments, but patients with plantar fasciitis typically get better.”

Physical therapy and lifestyle

Exercise, physical therapy, and weight loss can all make a difference in addressing foot pain, too.

“One pound of additional weight on your body leads to six pounds of additional pressure on your foot. So, if you lose 10 pounds, that is really taking 60 pounds of pressure off your foot,” Peden says.

With the pandemic, many people have gained weight, which compounds the problem. But the key is not to do too much too quickly to try to reverse it, Peden says.

“If you try to lose weight by suddenly walking too much, that’s hard on your feet, too, and may lead to other foot problems. So, I often recommend cross-training, including low-impact cardio activities like biking or swimming. You can walk, but try to take it easy and, as always, wear good, supportive shoes.”

Hiking shoes are often a good option, particularly if you walk on uneven surfaces, including trails. “They are a little safer than sneakers, and protect your foot and ankle better,” he says.

In certain cases, physical therapy is recommended for lingering foot issues. “Physical therapists have many techniques that can speed up the recovery process,” Peden says.

Surgery is rarely needed for chronic conditions like tendonitis or plantar fasciitis. “We always treat our patients first with nonsurgical options to hopefully manage the condition before we ever talk about surgery,” Peden says.

But if you are feeling foot pain, don’t be afraid to seek medical help, Peden advises.

“I know people have different comfort levels right now about seeking medical care during the pandemic, but if you have a foot issue and it’s been hurting for a while, you should go see your doctor. There are likely easy solutions.”

Source: Yale University

The post How to avoid achy feet while working at home appeared first on Futurity.

Futurity

Pandas DataFrame Computations & Descriptive Stats – Part 4

http://img.youtube.com/vi/ytUj53rYCME/0.jpg

The Pandas DataFrame has several methods concerning Computations and Descriptive Stats. When applied to a DataFrame, these methods evaluate the elements and return the results.

  • Part 1 focuses on the DataFrame methods abs(), all(), any(), clip(), corr(), and corrwith().
  • Part 2 focuses on the DataFrame methods count(), cov(), cummax(), cummin(), cumprod(), cumsum().
  • Part 3 focuses on the DataFrame methods describe(), diff(), eval(), kurtosis().
  • Part 4 focuses on the DataFrame methods mad(), min(), max(), mean(), median(), and mode().

Getting Started

Remember to add the Required Starter Code to the top of each code snippet. This snippet will allow the code in this article to run error-free.

Required Starter Code

import pandas as pd
import numpy as np 

Before any data manipulation can occur, two new libraries will require installation.

  • The pandas library enables access to/from a DataFrame.
  • The numpy library supports multi-dimensional arrays and matrices in addition to a collection of mathematical functions.

To install these libraries, navigate to an IDE terminal and execute the commands below. For the terminal used in this example, the command prompt is a dollar sign ($). Your terminal prompt may be different.

$ pip install pandas

Hit the <Enter> key on the keyboard to start the installation process.

$ pip install numpy

Hit the <Enter> key on the keyboard to start the installation process.

If the installations were successful, a message displays in the terminal indicating the same.

DataFrame mad()

The mad() method (Mean Absolute Deviation) is the average distance of all DataFrame elements from the mean.

The syntax for this method is as follows:

DataFrame.mad(axis=None, skipna=None, level=None)

  • axis: If zero (0) or index, apply the function to each column. If one (1) or columns, apply the function to each row. Default is None.
  • skipna: If True, any NaN/NULL values are ignored. If False, all values are included, valid or empty. Default is None.
  • level: Set this parameter if the DataFrame/Series is multi-level. Default is None.

This example retrieves the MAD of four (4) Hockey Teams.

df_teams = pd.DataFrame({'Bruins':   [4, 5, 9],
                         'Oilers':   [3, 6, 10],
                         'Leafs':    [2, 7, 11],
                         'Flames': [1, 8, 12]})

result = df_teams.mad(axis=0).apply(lambda x:round(x,3))
print(result)
  • Line [1] creates a DataFrame from a Dictionary of Lists and saves it to df_teams.
  • Line [2] uses the mad() method with axis=0 to calculate the MAD of each column of the DataFrame. The lambda function rounds the output to three (3) decimal places. This output saves to the result variable.
  • Line [3] outputs the result to the terminal.

Output:

Bruins 2.000
Oilers 2.444
Leafs 3.111
Flames 4.000
dtype: float64
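
Note that DataFrame.mad() was deprecated in pandas 1.5 and removed in pandas 2.0. On newer pandas versions, the same result can be computed by hand from the definition above; a sketch using the same df_teams data:

```python
import pandas as pd

df_teams = pd.DataFrame({'Bruins': [4, 5, 9],
                         'Oilers': [3, 6, 10],
                         'Leafs':  [2, 7, 11],
                         'Flames': [1, 8, 12]})

# Mean Absolute Deviation by hand: the average distance of each element
# from its column mean. Works on any pandas version.
result = (df_teams - df_teams.mean()).abs().mean().round(3)
print(result)
```

This reproduces the mad() output shown above (Bruins 2.000, Oilers 2.444, Leafs 3.111, Flames 4.000).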

DataFrame min()

The min() method returns the smallest value(s) from a DataFrame/Series. The DataFrame.min() or Series.min() method, or NumPy’s minimum() function, can accomplish this task.

The syntax for this method is as follows:

DataFrame.min(axis=None, skipna=None, level=None, numeric_only=None, **kwargs)

  • axis: If zero (0) or index, apply the function to each column. If one (1) or columns, apply the function to each row. Default is None.
  • skipna: If True, any NaN/NULL values are ignored. If False, all values are included, valid or empty. Default is None.
  • level: Set this parameter if the DataFrame/Series is multi-level. Default is None.
  • numeric_only: Only include columns that contain integers, floats, or boolean values.
  • **kwargs: This is where you can add additional keywords.

For this example, we will determine which Team(s) have the smallest number of wins, losses, or ties.

Code Example 1:

df_teams = pd.DataFrame({'Bruins':   [4, 5,  9],
                         'Oilers':    [3, 6, 14],
                         'Leafs':     [2, 7, 11],
                         'Flames':  [21, 8, 7]})

result = df_teams.min(axis=0)
print(result)
  • Line [1] creates a DataFrame from a dictionary of lists and saves it to df_teams.
  • Line [2] uses the min() method with axis=0 to retrieve the minimum value(s) from each column of the DataFrame. This output saves to the result variable.
  • Line [3] outputs the result to the terminal.

Output:

Bruins 4
Oilers 3
Leafs 2
Flames 7
dtype: int64

This example uses two (2) arrays and retrieves the minimum value(s) of the Series.

Code Example 2:

c11_grades = [63, 78, 83, 93]
c12_grades = [73, 84, 79, 83]

result = np.minimum(c11_grades, c12_grades)
print(result)
  • Lines [1-2] create lists of grades and assign them to the appropriate variables.
  • Line [3] uses NumPy minimum to compare the two (2) arrays. This output saves to the result variable.
  • Line [4] outputs the result to the terminal.

Output:

[63 78 79 83]

DataFrame max()

The max() method returns the largest value(s) from a DataFrame/Series. The DataFrame.max() or Series.max() method, or NumPy’s maximum() function, can accomplish this task.

The syntax for this method is as follows:

DataFrame.max(axis=None, skipna=None, level=None, numeric_only=None, **kwargs)

  • axis: If zero (0) or index, apply the function to each column. If one (1) or columns, apply the function to each row. Default is None.
  • skipna: If True, any NaN/NULL values are ignored. If False, all values are included, valid or empty. Default is None.
  • level: Set this parameter if the DataFrame/Series is multi-level. Default is None.
  • numeric_only: Only include columns that contain integers, floats, or boolean values.
  • **kwargs: This is where you can add additional keywords.

For this example, we will determine which Team(s) have the largest number of wins, losses, or ties.

Code Example 1:

df_teams = pd.DataFrame({'Bruins':   [4, 5,  9],
                         'Oilers':    [3, 6, 14],
                         'Leafs':     [2, 7, 11],
                         'Flames':  [21, 8, 7]})

result = df_teams.max(axis=0)
print(result)
  • Line [1] creates a DataFrame from a Dictionary of Lists and saves it to df_teams.
  • Line [2] uses the max() method with axis=0 to retrieve the maximum value(s) from each column of the DataFrame. This output saves to the result variable.
  • Line [3] outputs the result to the terminal.

Output:

Bruins 9
Oilers 14
Leafs 11
Flames 21
dtype: int64

This example uses two (2) arrays and retrieves the maximum value(s) of the Series.

Code Example 2:

c11_grades = [63, 78, 83, 93]
c12_grades = [73, 84, 79, 83]

result = np.maximum(c11_grades, c12_grades)
print(result)
  • Lines [1-2] create lists of grades and assign them to the appropriate variables.
  • Line [3] uses the NumPy library maximum function to compare the two (2) arrays. This output saves to the result variable.
  • Line [4] outputs the result to the terminal.

Output:

[73 84 83 93]

DataFrame mean()

The mean() method returns the average of the DataFrame/Series across a requested axis. If a DataFrame is used, the results will return a Series. If a Series is used, the result will return a single number (float).

The following methods can accomplish this task:

  • The DataFrame.mean() method, or
  • The Series.mean() method

The syntax for this method is as follows:

DataFrame.mean(axis=None, skipna=None, level=None, numeric_only=None, **kwargs)

  • axis: If zero (0) or index, apply the function to each column. If one (1) or columns, apply the function to each row. Default is None.
  • skipna: If True, any NaN/NULL values are ignored. If False, all values are included, valid or empty. Default is None.
  • level: Set this parameter if the DataFrame/Series is multi-level. Default is None.
  • numeric_only: Only include columns that contain integers, floats, or boolean values.
  • **kwargs: This is where you can add additional keywords.

For this example, we will determine average wins, losses and ties for our Hockey Teams.

Code Example 1:

df_teams = pd.DataFrame({'Bruins':   [4, 5,  9],
                         'Oilers':    [3, 6, 14],
                         'Leafs':     [2, 7, 11],
                         'Flames':  [21, 8, 7]})

result = df_teams.mean(axis=0).apply(lambda x:round(x,2))
print(result)
  • Line [1] creates a DataFrame from a Dictionary of Lists and saves it to df_teams.
  • Line [2] uses the mean() method with axis=0 to calculate the mean (average) of each column of the DataFrame. The lambda function rounds the output to two (2) decimal places. This output saves to the result variable.
  • Line [3] outputs the result to the terminal.

Output:

Bruins 6.00
Oilers 7.67
Leafs 6.67
Flames 12.00
dtype: float64

For this example, Alice Accord, an employee of Rivers Clothing, has logged her hours for the week. Let’s calculate the mean (average) hours worked per day.

Code Example 2:

hours  = pd.Series([40.5, 37.5, 40, 55])
result = hours.mean()
print(result)
  • Line [1] creates a Series of hours worked for the week and saves to hours.
  • Line [2] uses the mean() method to calculate the mean (average). This output saves to the result variable.
  • Line [3] outputs the result to the terminal.

Output:

43.25

DataFrame median()

The median() method calculates and returns the median of DataFrame/Series elements across a requested axis. In other words, the median determines the middle number(s) of the dataset.

The syntax for this method is as follows:

DataFrame.median(axis=None, skipna=None, level=None, numeric_only=None, **kwargs)

Parameter Description
axis If zero (0) or index, apply the function to each column. If one (1) or columns, apply the function to each row. Default is None.
skipna If this parameter is True, any NaN/NULL values are ignored. If False, all values are included, valid or empty. If no value is given, None is assumed.
level Set this parameter if the DataFrame/Series is multi-level. If no value is given, None is assumed.
numeric_only Only include columns that contain integers, floats, or Boolean values.
**kwargs This is where you can add additional keywords.

For this example, we will determine the median values for our Hockey Teams.

df_teams = pd.DataFrame({'Bruins': [4, 5,  9],
                         'Oilers': [3, 6, 14],
                         'Leafs':  [2, 7, 11],
                         'Flames': [21, 8, 7]})

result = df_teams.median(axis=0)
print(result)

  • Line [1] creates a DataFrame from a Dictionary of Lists and saves it to df_teams.
  • Line [2] uses the median() method to calculate the median for each Team. This output saves to the result variable.
  • Line [3] outputs the result to the terminal.

Output:

Bruins 5.0
Oilers 6.0
Leafs 7.0
Flames 8.0
dtype: float64
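As with mean(), the axis parameter also works row-wise. A short sketch with the same teams DataFrame:

```python
import pandas as pd

df_teams = pd.DataFrame({'Bruins': [4, 5, 9],
                         'Oilers': [3, 6, 14],
                         'Leafs':  [2, 7, 11],
                         'Flames': [21, 8, 7]})

# axis=1 takes the middle value(s) of each row
result = df_teams.median(axis=1)
print(result)
```

For row 0, the sorted values are [2, 3, 4, 21]; with an even count, the median is the average of the two middle numbers, (3 + 4) / 2 = 3.5.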

DataFrame mode()

The mode() method returns the most frequently occurring value(s) in each column or row of a DataFrame/Series.

The syntax for this method is as follows:

DataFrame.mode(axis=0, numeric_only=False, dropna=True)

Parameter Description
axis If zero (0) or index, apply the function to each column. If one (1) or columns, apply the function to each row. Default is 0.
numeric_only Only include columns that contain integers, floats, or Boolean values.
dropna If set to True, this parameter ignores all NaN and NaT values. By default, this value is True.

For this example, we determine the most frequent value(s) in each column. Since every value in each column appears exactly once, each column returns all of its values as modes.

Code

df_teams = pd.DataFrame({'Bruins': [4, 5,  9],
                         'Oilers': [3, 9, 13],
                         'Leafs':  [2, 7, 4],
                         'Flames': [13, 9, 7]})

result = df_teams.mode(axis=0)
print(result)

  • Line [1] creates a DataFrame from a Dictionary of Lists and saves it to df_teams.
  • Line [2] uses the mode() method across the column axis. This output saves to the result variable.
  • Line [3] outputs the result to the terminal.

Output:

   Bruins  Oilers  Leafs  Flames
0       4       3      2       7
1       5       9      4       9
2       9      13      7      13
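When a value genuinely repeats, mode() returns only that value. A minimal sketch with a hypothetical Series:

```python
import pandas as pd

# Hypothetical data: 7 appears twice, so it is the single mode
s = pd.Series([7, 3, 7, 1])
print(s.mode())
```

Because 7 is the only repeated value, the result is a one-element Series rather than all values.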

Finxter

Pandas DataFrame Computations & Descriptive Stats – Part 5


The Pandas DataFrame has several methods concerning Computations and Descriptive Stats. When applied to a DataFrame, these methods evaluate the elements and return the results.

  • Part 1 focuses on the DataFrame methods abs(), all(), any(), clip(), corr(), and corrwith().
  • Part 2 focuses on the DataFrame methods count(), cov(), cummax(), cummin(), cumprod(), cumsum().
  • Part 3 focuses on the DataFrame methods describe(), diff(), eval(), kurtosis().
  • Part 4 focuses on the DataFrame methods mad(), min(), max(), mean(), median(), and mode().
  • Part 5 focuses on the DataFrame methods pct_change(), quantile(), rank(), round(), prod(), and product().

Getting Started

Remember to add the Required Starter Code to the top of each code snippet. This snippet will allow the code in this article to run error-free.

Required Starter Code

import pandas as pd
import numpy as np 

Before any data manipulation can occur, two new libraries will require installation.

  • The pandas library enables access to/from a DataFrame.
  • The numpy library supports multi-dimensional arrays and matrices in addition to a collection of mathematical functions.

To install these libraries, navigate to an IDE terminal and execute the commands below. For the terminal used in this example, the command prompt is a dollar sign ($). Your terminal prompt may be different.

$ pip install pandas

Hit the <Enter> key on the keyboard to start the installation process.

$ pip install numpy

Hit the <Enter> key on the keyboard to start the installation process.

Feel free to check out the correct ways of installing those libraries here:

If the installations were successful, a message displays in the terminal indicating the same.

DataFrame pct_change()

The pct_change() method calculates and returns the percentage change between the current and prior element(s) in a DataFrame. The return value is the same type as the caller.

To fully understand this method and other methods in this tutorial from a mathematical point of view, feel free to watch this short tutorial:

The syntax for this method is as follows:

DataFrame.pct_change(periods=1, fill_method='pad', limit=None, freq=None, **kwargs)
Parameter Description
periods This sets the period(s) to calculate the percentage change.
fill_method This determines what value NaN contains.
limit This sets how many NaN values to fill in the DataFrame before stopping.
freq Used for a specified time series.
**kwargs Additional keywords passed into a DataFrame/Series.

This example calculates and returns the percentage change of four (4) fictitious stocks over three (3) months.

df = pd.DataFrame({'ASL':  [18.93, 17.03, 14.87],
                   'DBL':   [39.91, 41.46, 40.99],
                   'UXL':   [44.01, 43.67, 41.98]},
                   index= ['2021-10-01', '2021-11-01', '2021-12-01'])

result = df.pct_change(axis='rows', periods=1)
print(result)
  • Line [1] creates a DataFrame from a dictionary of lists and saves it to df.
  • Line [2] uses the pct_change() method with a selected axis and period to calculate the change. This output saves to the result variable.
  • Line [3] outputs the result to the terminal.

Output:

                 ASL       DBL       UXL
2021-10-01       NaN       NaN       NaN
2021-11-01 -0.100370  0.038837 -0.007726
2021-12-01 -0.126835 -0.011336 -0.038699

💡 Note: The first line contains NaN values as there is no previous row.
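The periods parameter controls how far back the comparison reaches. A sketch with the same stock DataFrame, assuming periods=2, so each row compares with the row two periods earlier:

```python
import pandas as pd

df = pd.DataFrame({'ASL': [18.93, 17.03, 14.87],
                   'DBL': [39.91, 41.46, 40.99],
                   'UXL': [44.01, 43.67, 41.98]},
                  index=['2021-10-01', '2021-11-01', '2021-12-01'])

# periods=2: the first two rows have no row two periods back, so they are NaN
result = df.pct_change(periods=2)
print(result)
```

Only the December row has a value, e.g. ASL changes by (14.87 - 18.93) / 18.93 ≈ -21.4% over the two months.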

DataFrame quantile()

The quantile() method returns the values from a DataFrame/Series at the specified quantile and axis.

The syntax for this method is as follows:

DataFrame.quantile(q=0.5, axis=0, numeric_only=True, interpolation='linear')
Parameter Description
q This is a value 0 <= q <= 1 and is the quantile(s) to calculate.
axis If zero (0) or index, apply the function to each column. If one (1) or columns, apply the function to each row. Default is 0.
numeric_only Only include columns that contain integers, floats, or boolean values.
interpolation Calculates the estimated median or quartiles for the DataFrame/Series.

To fully understand the interpolation parameter from a mathematical point of view, feel free to check out this tutorial:

This example uses the same stock DataFrame as noted above to determine the quantile(s).

df = pd.DataFrame({'ASL':  [18.93, 17.03, 14.87],
                   'DBL':   [39.91, 41.46, 40.99],
                   'UXL':   [44.01, 43.67, 41.98]})

result = df.quantile(0.15)
print(result)
  • Line [1] creates a DataFrame from a dictionary of lists and saves it to df.
  • Line [2] uses the quantile() method to calculate by setting the q (quantile) parameter to 0.15. This output saves to the result variable.
  • Line [3] outputs the result to the terminal.

Output:

ASL    15.518
DBL    40.234
UXL    42.487
Name: 0.15, dtype: float64
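The q parameter also accepts a list of quantiles, returning one row per quantile. A short sketch with the same stock DataFrame:

```python
import pandas as pd

df = pd.DataFrame({'ASL': [18.93, 17.03, 14.87],
                   'DBL': [39.91, 41.46, 40.99],
                   'UXL': [44.01, 43.67, 41.98]})

# A list of quantiles returns a DataFrame with one row per quantile
result = df.quantile([0.25, 0.5, 0.75])
print(result)
```

The 0.5 row is simply the median of each column, e.g. 17.03 for ASL.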

DataFrame rank()

The rank() method returns a DataFrame/Series with the values ranked in order. The return value is the same as the caller.

The syntax for this method is as follows:

DataFrame.rank(axis=0, method='average', numeric_only=None, na_option='keep', ascending=True, pct=False)
Parameter Description
axis If zero (0) or index, apply the function to each column. If one (1) or columns, apply the function to each row. Default is 0.
method Determines how to rank identical values:
– average: the average rank of the group.
– min: the lowest rank in the group.
– max: the highest rank in the group.
– first: ranks assigned in the order the values appear in the array.
– dense: like min, but the rank increases by one (1) between groups.
numeric_only Only include columns that contain integers, floats, or Boolean values.
na_option Determines how NaN values rank:
– keep: assigns NaN as the rank of NaN values.
– top: assigns the lowest rank to any NaN values found.
– bottom: assigns the highest rank to any NaN values found.
ascending Determines if the elements/values rank in ascending or descending order.
pct If set to True, the results will return in percentile form. By default, this value is False.

For this example, a CSV file is read in, ranked on Population, and sorted. Click here to download and move this file to the current working directory.

df = pd.read_csv("countries.csv")
df["Rank"] = df["Population"].rank()
df.sort_values("Population", inplace=True)
print(df)
  • Line [1] reads in the countries.csv file and saves it to df.
  • Line [2] appends a Rank column to the end of the DataFrame (df).
  • Line [3] sorts the DataFrame by Population in ascending order.
  • Line [4] outputs the result to the terminal.

Output:

     Country     Capital  Population      Area  Rank
4     Poland      Warsaw    38383000    312685   1.0
2      Spain      Madrid    47431256    498511   2.0
3      Italy        Rome    60317116    301338   3.0
1     France       Paris    67081000    551695   4.0
0    Germany      Berlin    83783942    357021   5.0
5     Russia      Moscow   146748590  17098246   6.0
6        USA  Washington   328239523   9833520   7.0
8      India       Dheli  1352642280   3287263   8.0
7      China     Beijing  1400050000   9596961   9.0
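The method parameter only matters when values tie. A minimal sketch with a hypothetical Series containing a tie:

```python
import pandas as pd

# Hypothetical scores: the two 70s tie for ranks 2 and 3
scores = pd.Series([50, 70, 70, 90])

print(scores.rank(method='average'))  # tied values share rank 2.5
print(scores.rank(method='min'))      # tied values both get rank 2
print(scores.rank(method='dense'))    # ranks 1, 2, 2, 3 with no gaps
```

Note that with dense, the rank after the tied group continues at 3 instead of jumping to 4.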

DataFrame round()

The round() method rounds the DataFrame output to a specified number of decimal places.

The syntax for this method is as follows:

DataFrame.round(decimals=0, *args, **kwargs)
Parameter Description
decimals Determines the specified number of decimal places to round the value(s).
*args Additional keywords passed into a DataFrame/Series.
**kwargs Additional keywords passed into a DataFrame/Series.

For this example, the Bank of Canada’s mortgage rates over three (3) months are displayed and rounded to three (3) decimal places.

Code Example 1:

df = pd.DataFrame([(2.3455, 1.7487, 2.198)], columns=['Month 1', 'Month 2', 'Month 3']) 
result = df.round(3)
print(result)
  • Line [1] creates a DataFrame complete with column names and saves to df.
  • Line [2] rounds the mortgage rates to three (3) decimal places. This output saves to the result variable.
  • Line [3] outputs the result to the terminal.

Output:

   Month 1  Month 2  Month 3
0    2.346    1.749    2.198

Another way to perform the same task is with a Lambda!

Code Example 2:

df = pd.DataFrame([(2.3455, 1.7487, 2.198)], 
                  columns=['Month 1', 'Month 2', 'Month 3']) 
result = df.apply(lambda x: round(x, 3))
print(result)
  • Line [1] creates a DataFrame complete with column names and saves to df.
  • Line [2] rounds the mortgage rates to three (3) decimal places using a Lambda. This output saves to the result variable.
  • Line [3] outputs the result to the terminal.

💡 Note: The output is identical to that of the above.
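round() also accepts a dictionary mapping column names to decimal counts, so each column can round differently. A sketch using the same mortgage-rate DataFrame:

```python
import pandas as pd

df = pd.DataFrame([(2.3455, 1.7487, 2.198)],
                  columns=['Month 1', 'Month 2', 'Month 3'])

# A dict rounds each column to its own number of decimal places
result = df.round({'Month 1': 1, 'Month 2': 2, 'Month 3': 3})
print(result)
```

Columns not listed in the dictionary are left unrounded.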

DataFrame prod() and product()

The prod() and product() methods are identical: both return the product of the values along the requested axis.

The syntax for these methods is as follows:

DataFrame.prod(axis=None, skipna=None, level=None, numeric_only=None, min_count=0, **kwargs)

DataFrame.product(axis=None, skipna=None, level=None, numeric_only=None, min_count=0, **kwargs)

Parameter Description
axis If zero (0) or index, apply the function to each column. If one (1) or columns, apply the function to each row. Default is None.
skipna If set to True, this parameter excludes NaN/NULL values when calculating the result.
level Set this parameter if the DataFrame/Series is multi-level. If no value is given, None is assumed.
numeric_only Only include columns that contain integers, floats, or Boolean values.
min_count The minimum number of valid values required to perform the calculation; otherwise, the result is NaN.
**kwargs Additional keywords passed into a DataFrame/Series.

For this example, a small DataFrame of numbers is created, and the product along the selected axis is returned.

Code:

df = pd.DataFrame({'A': [2, 4, 6],
                   'B': [7, 3, 5],
                   'C': [6, 3, 1]})

index_ = ['A', 'B', 'C']
df.index = index_

result = df.prod(axis=0)
print(result)

  • Line [1] creates a DataFrame and saves it to df.
  • Lines [2-3] create and set the DataFrame index.
  • Line [4] calculates the product along axis 0. This output saves to the result variable.
  • Line [5] outputs the result to the terminal.

Output:

Formula Example: 2*4*6=48

A 48
B 105
C 18
dtype: int64
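The min_count parameter guards against columns with too few valid values; without it, the product of an all-NaN column defaults to one (1), the empty product. A sketch with a hypothetical DataFrame:

```python
import pandas as pd
import numpy as np

# Hypothetical: column B contains only NaN values
df = pd.DataFrame({'A': [2.0, 4.0], 'B': [np.nan, np.nan]})

print(df.prod())             # B's empty product defaults to 1.0
print(df.prod(min_count=1))  # B has fewer than 1 valid value, so it is NaN
```

Setting min_count=1 makes missing data visible instead of silently reporting 1.0.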

Finxter