Python Converting List of Strings to * [Ultimate Guide]



Since I frequently handle textual data with Python, I’ve encountered the challenge of converting lists of strings into different data types time and again. This article, originally penned for my own reference, decisively tackles this issue and might just prove useful for you too!

Let’s get started!

Python Convert List of Strings to Ints

This section is for you if you have a list of strings representing numbers and want to convert them to integers.

The first approach is using a for loop to iterate through the list and convert each string to an integer using the int() function.

Here’s a code snippet to help you understand:

string_list = ['1', '2', '3']
int_list = []

for item in string_list:
    int_list.append(int(item))

print(int_list)  # Output: [1, 2, 3]

Another popular method is using list comprehension. It’s a more concise way of achieving the same result as the for loop method.

Here’s an example:

string_list = ['1', '2', '3']
int_list = [int(item) for item in string_list]
print(int_list)  # Output: [1, 2, 3]

You can also use the built-in map() function, which applies a specified function (in this case, int()) to each item in the input list. Just make sure to convert the result back to a list using list().

Take a look at this example:

string_list = ['1', '2', '3']
int_list = list(map(int, string_list))
print(int_list)  # Output: [1, 2, 3]

For a full guide on the matter, check out our blog tutorial:

Recommended: How to Convert a String List to an Integer List in Python

Python Convert List Of Strings To Floats

If you want to convert a list of strings to floats in Python, you’ve come to the right place. Next, let’s explore a few different ways you can achieve this.

First, one simple and Pythonic way to convert a list of strings to a list of floats is by using list comprehension.

Here’s how you can do it:

strings = ["1.2", "2.3", "3.4"]
floats = [float(x) for x in strings]

In this example, the list comprehension iterates over each element in the strings list, converting each element to a float using the built-in float() function.

Another approach is to use the map() function along with float() to achieve the same result:

strings = ["1.2", "2.3", "3.4"]
floats = list(map(float, strings))

The map() function applies the float() function to each element in the strings list, and then we convert the result back to a list using the list() function.

If your strings contain decimal separators other than the dot (.), like a comma (,), you need to replace them first before converting to floats:

strings = ["1,2", "2,3", "3,4"]
floats = [float(x.replace(',', '.')) for x in strings]

This will ensure that the values are correctly converted to float numbers.

Recommended: How to Convert a String List to a Float List in Python

Python Convert List Of Strings To String

You might need to convert a list of strings into a single string in Python. It’s quite simple! You can use the join() method to combine the elements of your list.

Here’s a quick example:

string_list = ['hello', 'world']
result = ''.join(string_list)  # Output: 'helloworld'

You might want to separate the elements with a specific character or pattern, like spaces or commas. Just modify the string used in the join() method:

result_with_spaces = ' '.join(string_list)  # Output: 'hello world'
result_with_commas = ', '.join(string_list)  # Output: 'hello, world'

If your list contains non-string elements such as integers or floats, you’ll need to convert them to strings first using a list comprehension or a map() function:

integer_list = [1, 2, 3]

# Using list comprehension
str_list = [str(x) for x in integer_list]
result = ','.join(str_list)  # Output: '1,2,3'

# Using map function
str_list = map(str, integer_list)
result = ','.join(str_list)  # Output: '1,2,3'

Play around with different separators and methods to find what best suits your needs.

Python Convert List Of Strings To One String

Are you looking for a simple way to convert a list of strings to a single string in Python?

The easiest method to combine a list of strings into one string uses the join() method. Just pass the list of strings as an argument to join(), and it’ll do the magic for you.

Here’s an example:

list_of_strings = ["John", "Charles", "Smith"]
combined_string = " ".join(list_of_strings)
print(combined_string)

Output:

John Charles Smith

You can also change the separator by modifying the string before the join() call. Now let’s say your list has a mix of data types, like integers and strings. No problem! Use the map() function along with join() to handle this situation:

list_of_strings = ["John", 42, "Smith"]
combined_string = " ".join(map(str, list_of_strings))
print(combined_string)

Output:

John 42 Smith

In this case, the map() function converts every element in the list to a string before joining them.

Another solution is using the str.format() method to merge the list elements. This is especially handy when you want to follow a specific template.

For example:

list_of_strings = ["John", "Charles", "Smith"]
result = " {} {} {}".format(*list_of_strings)
print(result)

Output:

John Charles Smith

And that’s it! Now you know multiple ways to convert a list of strings into one string in Python.

Python Convert List of Strings to Comma Separated String

So you’d like to convert a list of strings to a comma-separated string using Python.

Here’s a simple solution that uses the join() function:

string_list = ['apple', 'banana', 'cherry']
comma_separated_string = ','.join(string_list)
print(comma_separated_string)

This code would output:

apple,banana,cherry

Using the join() function is a fantastic and efficient way to concatenate strings in a list, adding your desired delimiter (in this case, a comma) between every element.

In case your list doesn’t only contain strings, don’t sweat! You can still convert it to a comma-separated string, even if it includes integers or other types. Just use list comprehension along with the str() function:

mixed_list = ['apple', 42, 'cherry']
comma_separated_string = ','.join(str(item) for item in mixed_list)
print(comma_separated_string)

And your output would look like:

apple,42,cherry

Now you have a versatile method to handle lists containing different types of elements.

Remember, if your list includes strings containing commas, you might want to choose a different delimiter or use quotes to better differentiate between items.

For example:

list_with_commas = ['apple,green', 'banana,yellow', 'cherry,red']
comma_separated_string = '"{}"'.format('", "'.join(list_with_commas))
print(comma_separated_string)

Here’s the output you’d get:

"apple,green", "banana,yellow", "cherry,red"

With these tips and examples, you should be able to easily convert a list of strings (or mixed data types) to comma-separated strings in Python.

Python Convert List Of Strings To Lowercase

Let’s dive into converting a list of strings to lowercase in Python. In this section, you’ll learn three handy methods to achieve this. Don’t worry, they’re easy!

Solution: List Comprehension

Firstly, you can use list comprehension to create a list with all lowercase strings. This is a concise and efficient way to achieve your goal.

Here’s an example:

original_list = ["Hello", "WORLD", "PyThon"]
lowercase_list = [item.lower() for item in original_list]
print(lowercase_list)  # Output: ['hello', 'world', 'python']

With this approach, the lower() method is applied to each item in the list, creating a new list with lowercase strings.

Solution: map() Function

Another way to convert a list of strings to lowercase is by using the map() function. This function applies a given function (in our case, str.lower()) to each item in a list.

Here’s an example:

original_list = ["Hello", "WORLD", "PyThon"]
lowercase_list = list(map(str.lower, original_list))
print(lowercase_list)  # Output: ['hello', 'world', 'python']

Remember to wrap the map() function with the list() function to get your desired output.

Solution: For Loop

Lastly, you can use a simple for loop. This approach might be more familiar and readable to some, but it’s typically less efficient than the other methods mentioned.

Here’s an example:

original_list = ["Hello", "WORLD", "PyThon"]
lowercase_list = []

for item in original_list:
    lowercase_list.append(item.lower())

print(lowercase_list)  # Output: ['hello', 'world', 'python']

I have written a complete guide on this on the Finxter blog. Check it out!

Recommended: Python Convert String List to Lowercase

Python Convert List of Strings to Datetime

In this section, we’ll guide you through converting a list of strings to datetime objects in Python. It’s a common task when working with date-related data, and can be quite easy to achieve with the right tools!

So, let’s say you have a list of strings representing dates, and you want to convert this into a list of datetime objects. First, you’ll need to import the datetime module to access the essential functions.

from datetime import datetime

Next, you can use the strptime() function from the datetime module to convert each string in your list to a datetime object. To do this, simply iterate over the list of strings and apply the strptime function with the appropriate date format.

For example, if your list contained dates in the "YYYY-MM-DD" format, your code would look like this:

date_strings_list = ["2023-05-01", "2023-05-02", "2023-05-03"]
date_format = "%Y-%m-%d"
datetime_list = [datetime.strptime(date_string, date_format) for date_string in date_strings_list]

By using list comprehension, you’ve efficiently transformed your list of strings into a list of datetime objects!

Keep in mind that you’ll need to adjust the date_format variable according to the format of the dates in your list of strings. Here are some common date format codes you might need (a combined example follows the list):

  • %Y: Year with century, as a decimal number (e.g., 2023)
  • %m: Month as a zero-padded decimal number (e.g., 05)
  • %d: Day of the month as a zero-padded decimal number (e.g., 01)
  • %H: Hour (24-hour clock) as a zero-padded decimal number (e.g., 17)
  • %M: Minute as a zero-padded decimal number (e.g., 43)
  • %S: Second as a zero-padded decimal number (e.g., 08)
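
If your strings also carry a time component, you can combine these codes in a single format string. Here’s a quick sketch (the timestamps below are made up for illustration):

from datetime import datetime

# Hypothetical timestamps that include both a date and a time
timestamps = ["2023-05-01 17:43:08", "2023-05-02 09:15:30"]
fmt = "%Y-%m-%d %H:%M:%S"
parsed = [datetime.strptime(ts, fmt) for ts in timestamps]
print(parsed[0].hour)  # Output: 17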

Python Convert List Of Strings To Bytes

So you want to convert a list of strings to bytes in Python? No worries, I’ve got your back. This brief section will guide you through the process.

First things first, serialize your list of strings as a JSON string, and then convert it to bytes. You can easily do this using Python’s built-in json module.

Here’s a quick example:

import json

your_list = ['hello', 'world']
list_str = json.dumps(your_list)
list_bytes = list_str.encode('utf-8')

Now, list_bytes is the byte representation of your original list.

But hey, what if you want to get back the original list from those bytes? Simple! Just do the reverse:

reconstructed_list = json.loads(list_bytes.decode('utf-8'))

And voilà! You’ve successfully converted a list of strings to bytes and back again in Python.

Remember that this method works well for lists containing strings. If your list includes other data types, you may need to convert them to strings first.

Python Convert List of Strings to Dictionary

Next, you’ll learn how to convert a list of strings to a dictionary. This can come in handy when you want to extract meaningful data from a list of key-value pairs represented as strings.

To get started, let’s say you have a list of strings that look like this:

data_list = ["Name: John", "Age: 30", "City: New York"]

You can convert this list into a dictionary using a simple loop and the split() method.

Here’s the recipe:

data_dict = {}

for item in data_list:
    key, value = item.split(": ")
    data_dict[key] = value

print(data_dict)  # Output: {"Name": "John", "Age": "30", "City": "New York"}

Sweet, you just converted your list to a dictionary! But, what if you want to make it more concise? Python offers an elegant solution with dictionary comprehension.

Check this out:

data_dict = {item.split(": ")[0]: item.split(": ")[1] for item in data_list}
print(data_dict)  # Output: {"Name": "John", "Age": "30", "City": "New York"}

With just one line of code, you achieved the same result. High five!
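
If you’d rather not call split() twice per item, one small variation is to feed the split pairs straight into the dict() constructor (this assumes every string splits cleanly into exactly one key and one value):

data_dict = dict(item.split(": ") for item in data_list)
print(data_dict)  # Output: {'Name': 'John', 'Age': '30', 'City': 'New York'}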

When dealing with more complex lists that contain strings in various formats or nested structures, you may need additional tools like the json.loads() method or the ast.literal_eval() function. But for simple cases like the example above, the loop and dictionary comprehension should be more than enough.
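
To illustrate that last point, here’s a minimal sketch using ast.literal_eval() to parse strings that already look like Python literals (the input strings are invented for this example):

import ast

raw_items = ["{'Name': 'John'}", "{'Age': 30}"]
parsed = [ast.literal_eval(item) for item in raw_items]  # each string becomes a dict
merged = {k: v for d in parsed for k, v in d.items()}
print(merged)  # Output: {'Name': 'John', 'Age': 30}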

Python Convert List Of Strings To Bytes-Like Object

How to convert a list of strings into a bytes-like object in Python? It’s quite simple and can be done easily using the json library and the utf-8 encoding.

Firstly, let’s tackle encoding your list of strings as a JSON string. You can use the json.dumps() function to achieve this.

Here’s an example:

import json

your_list = ['hello', 'world']
json_string = json.dumps(your_list)

Now that you have the JSON string, you can convert it to a bytes-like object using the encode() method of the string.

Simply specify the encoding you’d like to use, which in this case is 'utf-8':

bytes_object = json_string.encode('utf-8')

And that’s it! Your list of strings has been successfully transformed into a bytes-like object. To recap, here’s the complete code snippet:

import json

your_list = ['hello', 'world']
json_string = json.dumps(your_list)
bytes_object = json_string.encode('utf-8')

If you ever need to decode the bytes-like object back into a list of strings, just use the decode() method followed by the json.loads() function like so:

decoded_string = bytes_object.decode('utf-8')
original_list = json.loads(decoded_string)

Python Convert List Of Strings To Array

Converting a list of strings to an array in Python is a piece of cake.

One simple approach is using the NumPy library, which offers powerful tools for working with arrays. To start, make sure you have NumPy installed. Afterward, you can create an array using the numpy.array() function.

Like so:

import numpy as np

string_list = ['apple', 'banana', 'cherry']
string_array = np.array(string_list)

Now your list is enjoying its new life as an array!

But sometimes, you may need to convert a list of strings into a specific data structure, like a NumPy character array. For this purpose, numpy.char.array() comes to the rescue:

char_array = np.char.array(string_list)

Now you have a character array! Easy as pie, right?

If you want to explore more options, check out the built-in split() method that lets you convert a string into a list, and subsequently into an array. This method is especially handy when you need to split a string based on a separator (for regular-expression-based splitting, there’s re.split()).
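
Here’s a quick sketch of that route, assuming a simple comma separator:

import numpy as np

fruit_string = "apple,banana,cherry"
fruit_list = fruit_string.split(",")  # ['apple', 'banana', 'cherry']
fruit_array = np.array(fruit_list)
print(fruit_array)  # Output: ['apple' 'banana' 'cherry']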

Python Convert List Of Strings To JSON

You’ve probably encountered a situation where you need to convert a list of strings to JSON format in Python. Don’t worry! We’ve got you covered. In this section, we’ll discuss a simple and efficient method to convert a list of strings to JSON using the json module in Python.

First things first, let’s import the necessary module:

import json

Now that you’ve imported the json module, you can use the json.dumps() function to convert your list of strings to a JSON string.

Here’s an example:

string_list = ["apple", "banana", "cherry"]
json_string = json.dumps(string_list)
print(json_string)

This will output the following JSON string:

["apple", "banana", "cherry"]

Great job! You’ve successfully converted a list of strings to JSON. But what if your list contains strings that are already in JSON format?

In this case, you can use the json.loads() function:

string_list = ['{"name": "apple", "color": "red"}', '{"name": "banana", "color": "yellow"}']
json_list = [json.loads(string) for string in string_list]
print(json_list)

The output will be:

[{"name": "apple", "color": "red"}, {"name": "banana", "color": "yellow"}]

And that’s it! Now you know how to convert a list of strings to JSON in Python, whether it’s a simple list of strings or a list of strings already in JSON format.

Python Convert List Of Strings To Numpy Array

Are you looking to convert a list of strings to a numpy array in Python? Next, we will briefly discuss how to achieve this using NumPy.

First things first, you need to import numpy. If you don’t have it installed, simply run pip install numpy in your terminal or command prompt.

Once you’ve done that, you can import numpy in your Python script as follows:

import numpy as np

Now that numpy is imported, let’s say you have a list of strings with numbers that you want to convert to a numpy array, like this:

A = ['33.33', '33.33', '33.33', '33.37']

To convert this list of strings into a NumPy array, you can use a simple list comprehension to first convert the strings to floats and then use the np.array() function to create the numpy array:

floats = [float(e) for e in A]
array_A = np.array(floats)

Congratulations! You’ve successfully converted your list of strings to a numpy array! Now that you have your numpy array, you can perform various operations on it. Some common operations include:

  • Finding the mean, min, and max:
mean_val, min_val, max_val = np.mean(array_A), np.min(array_A), np.max(array_A)
  • Reshaping the array:
reshaped_array = array_A.reshape(2, 2)
  • Performing element-wise arithmetic with another array:
array_B = np.array([1.0, 2.0, 3.0, 4.0])
result = array_A + array_B

Now you know how to convert a list of strings to a numpy array and perform various operations on it.

Python Convert List of Strings to Numbers

To convert a list of strings to numbers in Python, Python’s map function can be your best friend. It applies a given function to each item in an iterable. To convert a list of strings into a list of numbers, you can use map with either the int or float function.

Here’s an example:

string_list = ["1", "2", "3", "4", "5"]
numbers_int = list(map(int, string_list))
numbers_float = list(map(float, string_list))

Alternatively, using list comprehension is another great approach. Just loop through your list of strings and convert each element accordingly.✨

Here’s what it looks like:

numbers_int = [int(x) for x in string_list]
numbers_float = [float(x) for x in string_list]

Maybe you’re working with a list that contains a mix of strings representing integers and floats. In that case, you can implement a conditional list comprehension like this:

mixed_list = ["1", "2.5", "3", "4.2", "5"]
numbers_mixed = [int(x) if "." not in x else float(x) for x in mixed_list]

And that’s it! Now you know how to convert a list of strings to a list of numbers using Python, using different techniques like the map function and list comprehension.

Python Convert List Of Strings To Array Of Floats

Starting out, you might have a list of strings containing numbers, like ['1.2', '3.4', '5.6'], and you want to convert these strings to an array of floats in Python.

Here’s how you can achieve this seamlessly:

Using List Comprehension

List comprehension is a concise way to create lists in Python. To convert the list of strings to a list of floats, you can use the following code:

list_of_strings = ['1.2', '3.4', '5.6']
list_of_floats = [float(x) for x in list_of_strings]

✨This will give you a new list list_of_floats containing [1.2, 3.4, 5.6].

Using NumPy

If you have numpy installed or are working with larger arrays, you might want to convert the list of strings to a numpy array of floats.

Here’s how you can do that:

import numpy as np

list_of_strings = ['1.2', '3.4', '5.6']
numpy_array = np.array(list_of_strings, dtype=float)

Now you have a numpy array of floats: array([1.2, 3.4, 5.6]).

Converting Nested Lists

If you’re working with a nested list of strings representing numbers, like:

nested_list_of_strings = [['1.2', '3.4'], ['5.6', '7.8']]

You can use the following list comprehension:

nested_list_of_floats = [[float(x) for x in inner] for inner in nested_list_of_strings]

This will result in a nested list of floats like [[1.2, 3.4], [5.6, 7.8]].


Phew! Hope this article helped you solve your conversion problems.


Be on the Right Side of Change

How “Invisible” Metal Cuts Are Made


Metal objects like the Metmo Cube are fascinating because they feature parts that are so precisely cut that you can’t see where one piece begins and the other one ends. Science educator Steve Mould explains wire EDM machining, which enables the creation of such incredibly tight-fitting objects.

The Awesomer

Save Money in AWS RDS: Don’t Trust the Defaults


Default settings can help you get started quickly – but they can also cost you performance and a higher cloud bill at the end of the month. Want to save money on your AWS RDS bill? I’ll show you some MySQL settings to tune to get better performance and cost savings with AWS RDS.

Recently I was engaged in a MySQL Performance Audit for a customer to help troubleshoot performance issues that led to downtime during periods of high traffic on their AWS RDS MySQL instances. During heavy loads, they would see messages about their InnoDB settings in the error logs:

[Note] InnoDB: page_cleaner: 1000ms intended loop took 4460ms. The settings might not be optimal. (flushed=140, during the time.)

This message is normally a side effect of a storage subsystem that cannot keep up with the number of writes (IOPS) required by MySQL. It is the storage effectively saying, “Hey MySQL, try to write less. I can’t keep up,” which is a common situation when innodb_io_capacity_max is set too high.

After receiving these messages for some time, they eventually hit performance issues to the point that the server became unresponsive for a few minutes. After that, things went back to normal.

Let’s look at the problem and try to gather some context information.

Investigating AWS RDS performance issues

We had a db.m5.8xlarge instance type (32vCPU – 128GB of RAM) with a gp2 storage of 5TB, which should provide up to 10000 IOPS (this is the maximum capacity allowed by gp2), running MySQL 5.7. This is a pretty decent setup, and I don’t see many customers needing to write this many sustained IOPS.

The innodb_io_capacity_max parameter was set to 2000, so the hardware should be able to deliver that many IOPS without major issues. However, gp2 suffers from a tricky way of calculating credits and usage that can lead to erroneous conclusions about the real capacity of the storage. Reviewing the CloudWatch graphs, only roughly 8-9k IOPS (reads and writes) were being used during spikes.

(CloudWatch graph: write ops on the AWS RDS MySQL instance)

While the IO utilization was quite high, there should be some room to get more IOPS, but we were still seeing errors. What caught my attention was the self-healing condition shown by MySQL after a few minutes.

The usual solution, which actually came up during our kick-off call, was, “Well, there is always the chance to move to Provisioned IOPS, but that is quite expensive.” Yes, this is true, io2 volumes are expensive, and honestly, I think they should be used only where really high IO capacity at expected latencies is required, and this didn’t seem to be the case.

Otherwise, most environments can adapt to gp2/gp3 volumes; you just need to provision a volume big enough to get sufficient IOPS.

Finding the “smoking gun” with pt-mysql-summary

Not too long ago, my colleague Yves Trudeau and I worked on a series of posts debating how to configure an instance for write-intensive workloads. A quick look at the pt-mysql-summary output shows something really interesting when examining the instance outside the busy load period:

# InnoDB #####################################################
                  Version | 5.7.38
         Buffer Pool Size | 93.0G
         Buffer Pool Fill | 100%
        Buffer Pool Dirty | 1%
           File Per Table | ON
                Page Size | 16k
            Log File Size | 2 * 128.0M = 256.0M
          Log Buffer Size | 8M
             Flush Method | O_DIRECT
      Flush Log At Commit | 1
               XA Support | ON
                Checksums | ON
              Doublewrite | ON
          R/W I/O Threads | 4 4
             I/O Capacity | 200
       Thread Concurrency | 0
      Concurrency Tickets | 5000
       Commit Concurrency | 0
      Txn Isolation Level | REPEATABLE-READ
        Adaptive Flushing | ON
      Adaptive Checkpoint | 
           Checkpoint Age | 78M
             InnoDB Queue | 0 queries inside InnoDB, 0 queries in queue

 

Wait, what? 256M of redo logs and a Checkpoint Age of only 78M? That is quite conservative, considering a 93GB buffer pool size; we should expect bigger redo logs for such a big buffer pool. Bingo! We have a smoking gun here.

Additionally, full ACID features were enabled (innodb_flush_log_at_trx_commit=1 and sync_binlog=1), which adds a lot of write overhead to every operation because, during the commit stage, everything is flushed to disk (or to gp2 in this case).

Considering a load spike running a lot of write queries, hitting the max checkpoint age in this setup is a very likely situation.

Basically, MySQL will perform flushing operations at a certain rate depending on several factors. This rate is normally close to innodb_io_capacity (200 by default); if the number of writes starts to approach the max checkpoint age, then the adaptive flushing algorithm will start to push up to innodb_io_capacity_max (2000 by default) to try to keep the free space in the redo logs far from the max checkpoint age limit.

If we keep pushing, we can eventually reach the max checkpoint age, which drives the system into a synchronous state, meaning that furious flushing will happen beyond innodb_io_capacity_max and all write operations will be paused (freezing writes) until there is free room in the redo logs to keep writing.

This was exactly what was happening on this server. We calculated roughly how many writes were being performed per hour, and then we recommended increasing the size of redo log files to 2x2GB each (4GB total). In practical terms, it was 3.7G due to some rounding that RDS does, so we got:

# InnoDB #####################################################
                  Version | 5.7.38
         Buffer Pool Size | 92.0G
         Buffer Pool Fill | 100%
        Buffer Pool Dirty | 2%
           File Per Table | ON
                Page Size | 16k
            Log File Size | 2 * 1.9G = 3.7G
          Log Buffer Size | 8M
             Flush Method | O_DIRECT

 

Then we also increased innodb_io_capacity_max to 4000, giving the adaptive flushing algorithm more headroom to increase writes. Results in CloudWatch show we were right:

(CloudWatch graph: IOPS on the AWS RDS MySQL instance after the change)

The reduction over the last couple of weeks is more than 50% of IOPS, which is pretty decent, and we haven’t changed the hardware at all. In fact, it was possible to reduce the storage size to 3TB and avoid moving to expensive io2 (Provisioned IOPS) storage.

Conclusions

RDS normally works very well out of the box; most of the configurations are properly set for the type of instance provisioned. Still, the RDS default redo log size being this small is silly, and people using a fully managed solution would expect not to have to worry about this kind of common tuning.

MySQL 8.0 implemented innodb_dedicated_server, which auto-sizes innodb_log_file_size and innodb_log_files_in_group (now replaced by innodb_redo_log_capacity) as a function of the InnoDB buffer pool size using a pretty simple but effective algorithm, and I guess it shouldn’t be hard for the AWS team to implement it. We’ve done some research, and it seems RDS is not porting this logic into the 8.0 versions, which makes such a small default for innodb_redo_log_capacity seem strange.

In the meantime, checking how RDS MySQL is configured with default parameters is something we should all review to avoid the typical “throw more hardware at it” solution – and, by extension, spending more money.


Percona Database Performance Blog

This Bittersweet Star Trek Short Imagines the Rebirth of the Enterprise-D


Screenshot: OTOY/The Roddenberry Archive

The restoration of the Enterprise-D, long thought lost since it was broken up and crash-landed decades prior in Star Trek: Generations, was one of the biggest surprises of Picard’s final season. Now that we know parts of the ship made their way back home to be restored, this touching new short imagines just how.


Created in partnership with the Roddenberry Archive, cloud-rendering graphics company OTOY has released a short epilogue to Generations aptly titled Re-Generation. It’s short and sweet, and well worth the couple minutes it takes to watch.


Using technology similar to what the company used to “holographically” re-create Star Trek’s original pilot, “The Cage,” the short bookends the two great losses of Generations: the destruction of the Enterprise-D, as its saucer section crash-lands into Veridian III, and the death of Admiral James T. Kirk. It’s a touching pairing, as we see the damaged Enterprise saucer being worked on by Starfleet personnel, while Spock (a digital recreation utilizing footage and scans of Leonard Nimoy) mournfully recovers Kirk’s badge.

Knowing now that the Enterprise-D would make it back to Geordi at the Fleet Museum, restored in time to save the galaxy from the Borg one last time, it’s a fascinating glimpse at the stories in between decades of Star Trek history.



Gizmodo

Vaquita Mini Kukri Returns under Civivi Brand


Spring is in full flush over here at KnifeNews HQ, which means that Summer is already on the horizon. Civivi just showcased its first June batch of knives, and it includes a fixed blade from Nate Matlack called the Vaquita II, sequel to a discontinued WE Knife Co. model.

The Vaquita II is clearly a kukri; that much is obvious from even a glance at the product page. It has that unmistakable, downturned, dog-leg style blade, possibly adapted from ancient farming implements and typically adept at chopping and demanding brushwork. However, if you look at the Vaquita II’s specs, you’ll see that this knife’s blade length measures just 3.2 inches – quite small compared to average examples of the form, putting chopping out of the question.


No, in fact, the Vaquita seems designed as a companion outdoors knife, useful for smaller tasks out on the trail. There’s also an EDC consideration here, unconventional as the kukri may be in that role; if you demand a lot of slicing horsepower, the bellied-out blade will provide that in spades. In either use case the steel here, Nitro-V, should serve admirably well. As a sort of nitrogen-enhanced AEB-L, Nitro-V is quite stain resistant, adequately tough, and holds an edge longer than many other common budget-friendly stainless choices.

Although it’s smaller than the usual kukri, the Vaquita II does have a nice, sensible handle, with cuts that make it intuitively ergonomic and visually striking: a big groove for the first finger and a bigger one behind it for the others. G-10 scales, affixed by large Torx screws, lie on top of the full frame and come in three different colors. The Vaquita II comes with a Kydex sheath and a length of beaded chain, suggesting that it may even be small enough to carry around the neck.

The Vaquita II is a Civivi-fied follow-up to the original Vaquita, a 2018 release. That model came with carbon fiber scales and S35VN blade steel. It has since been discontinued.

The Civivi Vaquita II is slated for release in June.

Knife in Featured Image: Civivi Vaquita II



The post Vaquita Mini Kukri Returns under Civivi Brand appeared first on KnifeNews.

KnifeNews

CodersLegacy: 10 Most Important Functions in BeautifulSoup

Beautiful Soup is a Python library that is commonly used for web scraping purposes. It is a very powerful tool for extracting and parsing data from HTML and XML files. Beautiful Soup provides several functions that make web scraping a lot easier. In this article, we will look at the 10 most important BeautifulSoup functions and how to use them to parse data.


1. BeautifulSoup()

The BeautifulSoup() function is used to create a Beautiful Soup object. This object represents the parsed HTML/XML document. It takes two arguments: the HTML/XML document as a string and the parser to be used. The parser is optional, and if it is not specified, Beautiful Soup will automatically select one based on the document.

from bs4 import BeautifulSoup

html_doc = """
<html>
  <head>
    <title>The Title</title>
  </head>
  <body>
    <p class='text'>Some text.</p>
  </body>
</html>"""

soup = BeautifulSoup(html_doc, 'html.parser')

In this example, we are creating a Beautiful Soup object from an HTML string using the html.parser parser. Printing out the soup object will show you all the HTML it currently has stored within it.


2. find()

The find() function is used to find the first occurrence of a tag in the HTML/XML document. It takes two arguments: the name of the tag and any attributes associated with it. The attributes are optional.

from bs4 import BeautifulSoup

html_doc = """
<html>
  <head>
    <title>The Title</title>
  </head>
  <body>
    <p class='text'>Some text.</p>
  </body>
</html>"""

soup = BeautifulSoup(html_doc, 'html.parser')
p_tag = soup.find('p', {'class': 'text'})
print(p_tag)
<p class="text">Some text.</p>

In this example, we are finding the first occurrence of the p tag with the class attribute set to 'text'.


3. find_all()

The find_all() function is used to find all occurrences of a tag in the HTML/XML document. It takes the same arguments as find().

from bs4 import BeautifulSoup

html_doc = """
<html>
    <head>
        <title>The Title</title>
    </head>
    <body>
        <p class='text'>Some text.</p>
        <p class='text'>More text.</p>
    </body>
</html>"""

soup = BeautifulSoup(html_doc, 'html.parser')
p_tags = soup.find_all('p', {'class': 'text'})
print(p_tags)
[<p class="text">Some text.</p>, <p class="text">More text.</p>]

In this example, we are finding all occurrences of the p tag with the class attribute set to 'text'.


4. get_text()

The get_text() function is used to get the text content of a tag. It takes no arguments.

from bs4 import BeautifulSoup

html_doc = """
<html>
  <head>
    <title>The Title</title>
  </head>
  <body>
    <p class='text'>Some text.</p>
  </body>
</html>"""

soup = BeautifulSoup(html_doc, 'html.parser')
p_tag = soup.find('p', {'class': 'text'})
text = p_tag.get_text()

print(text)
Some text.

In this example, we are getting the text content of the p tag we found earlier.


5. get()

The get() function is used to get the value of an attribute of a tag. It takes one argument, which is the name of the attribute.

from bs4 import BeautifulSoup

html_doc = """
<html>
  <head>
    <title>The Title</title>
  </head>
  <body>
    <p class='text'>Some text.</p>
  </body>
</html>"""

soup = BeautifulSoup(html_doc, 'html.parser') 
p_tag = soup.find('p', {'class': 'text'}) 
class_attribute = p_tag.get('class')

print(class_attribute)
['text']

In this example, we are getting the value of the class attribute of the p tag we found earlier. This works for other attributes like “href” and “id” as well.


6. find_parent()

The find_parent() function is used to find the parent tag of a given tag. You can optionally pass the name of the parent tag you’re looking for.

from bs4 import BeautifulSoup

html_doc = """
<html>
    <head>
        <title>The Title</title>
    </head>
    <body>
        <div>
            <p class='text'>Some text.</p>
        </div>
    </body>
</html>"""

soup = BeautifulSoup(html_doc, 'html.parser') 
p_tag = soup.find('p', {'class': 'text'}) 
div_tag = p_tag.find_parent('div')

print(div_tag)
<div>
<p class="text">Some text.</p>
</div>

In this example, we are finding the parent div tag of the p tag we found earlier.


7. find_next_sibling()

The find_next_sibling() function is used to find the next sibling tag of a given tag. You can optionally pass a tag name to match.

from bs4 import BeautifulSoup

html_doc = """
<html>
    <head>
        <title>The Title</title>
    </head>
    <body>
        <p class='text'>Some text.</p>
        <p class='text'>More text.</p>
    </body>
</html>"""

soup = BeautifulSoup(html_doc, 'html.parser') 
p_tag = soup.find('p', {'class': 'text'}) 
next_p_tag = p_tag.find_next_sibling('p')

print(next_p_tag)
<p class="text">More text.</p>

In this example, we are finding the next p tag that comes after the p tag we found earlier.


8. find_all_next()

The find_all_next() function is used to find all the tags that come after a given tag in the HTML/XML document. It can be called with no arguments, or with the same optional filters as find_all().

from bs4 import BeautifulSoup

html_doc = """
<html>
  <head>
    <title>The Title</title>
  </head>
  <body>
    <div>
      <p class='text'>Some text.</p>
      <p class='text'>More text.</p>
      <span class='text'>Some more text.</span>
    </div>
  </body>
</html>"""

soup = BeautifulSoup(html_doc, 'html.parser') 
p_tag = soup.find('p', {'class': 'text'}) 
next_tags = p_tag.find_all_next()

print(next_tags)
[<p class="text">More text.</p>, <span class="text">Some more text.</span>]

In this example, we are finding all the tags that come after the p tag we found earlier.


9. select()

The select() function is one of the most important functions in BeautifulSoup, used to select tags based on CSS selectors. It takes one argument, which is the CSS selector.

from bs4 import BeautifulSoup

html_doc = """

<html>
    <head>
        <title>The Title</title>
    </head>
    <body>
        <div>
            <p class='text'>Some text.</p>
        </div>
        <div>
            <p class='text'>More text.</p>
        </div>
    </body>
</html>
"""

soup = BeautifulSoup(html_doc, 'html.parser') 
p_tags = soup.select('div > p.text')
print(p_tags)
[<p class="text">Some text.</p>, <p class="text">More text.</p>]

In this example, we are selecting all the p tags with the class attribute set to ‘text’ that are inside a div tag.


10. prettify()

The prettify() function is used to make the HTML/XML document more human-readable. It takes no arguments.

from bs4 import BeautifulSoup

html_doc = """<html><head><title>The Title</title></head><body><p class='text'>Some text.</p></body></html> """

soup = BeautifulSoup(html_doc, 'html.parser') 
prettified_html = soup.prettify()
print(prettified_html)
<html>
 <head>
  <title>
   The Title
  </title>
 </head>
 <body>
  <p class="text">
   Some text.
  </p>
 </body>
</html>

In this example, we are making the HTML document more human-readable using the prettify() function.


Conclusion

Beautiful Soup is a powerful tool for web scraping in Python. In this article, we have covered the 10 most important functions of Beautiful Soup and how to use them to parse data from HTML and XML files. These functions are just a few of the many functions provided by Beautiful Soup, and by mastering them, you can become an expert in web scraping with Python.

This marks the end of the 10 most Important Functions in BeautifulSoup article. Any suggestions or contributions for CodersLegacy are more than welcome. Questions regarding the tutorial content can be asked in the comments section below.

The post 10 Most Important Functions in BeautifulSoup appeared first on CodersLegacy.

Planet Python

Python Web Scraping: From URL to CSV in No Time



Setting up the Environment

Before diving into web scraping with Python, set up your environment by installing the necessary libraries.

First, install the following libraries: requests, BeautifulSoup, and pandas. These packages play a crucial role in web scraping, each serving different purposes.✨

To install these libraries, click on the previously provided links for a full guide (including troubleshooting) or simply run the following commands:

pip install requests
pip install beautifulsoup4
pip install pandas

The requests library will be used to make HTTP requests to websites and download the HTML content. It simplifies the process of fetching web content in Python.

BeautifulSoup is a fantastic library that helps extract data from the HTML content fetched from websites. It makes navigating, searching, and modifying HTML easy, making web scraping straightforward and convenient.

Pandas will be helpful in data manipulation and organizing the scraped data into a CSV file. It provides powerful tools for working with structured data, making it popular among data scientists and web scraping enthusiasts.

Fetching and Parsing URL

Next, you’ll learn how to fetch and parse URLs using Python to scrape data and save it as a CSV file. We will cover sending HTTP requests, handling errors, and utilizing libraries to make the process efficient and smooth.

Sending HTTP Requests

When fetching content from a URL, Python offers a powerful library known as the requests library. It allows users to send HTTP requests, such as GET or POST, to a specific URL, obtain a response, and parse it for information.

We will use the requests library to help us fetch data from our desired URL.

For example:

import requests
response = requests.get('https://example.com/data.csv')

The variable response will store the server’s response, including the data we want to scrape. From here, we can access the content using response.content, which will return the raw data in bytes format.
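
As a small illustration, you could write that raw content straight to a local file (a sketch reusing the placeholder URL from above):

import requests

response = requests.get('https://example.com/data.csv')

# response.content is bytes, so open the target file in binary mode
with open('data.csv', 'wb') as f:
    f.write(response.content)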

Handling HTTP Errors

Handling HTTP errors while fetching data from URLs ensures a smooth experience and prevents unexpected issues. The requests library makes error handling easy by providing methods to check whether the request was successful.

Here’s a simple example:

import requests
response = requests.get('https://example.com/data.csv')
response.raise_for_status()

The raise_for_status() method will raise an exception if there’s an HTTP error, such as a 404 Not Found or 500 Internal Server Error. This helps us ensure that our script doesn’t continue to process erroneous data, allowing us to gracefully handle any issues that may arise.

With these tools, you are now better equipped to fetch and parse URLs using Python. This will enable you to effectively scrape data and save it as a CSV file.

Extracting Data from HTML

In this section, we’ll discuss extracting data from HTML using Python. The focus will be on utilizing the BeautifulSoup library and locating elements by their tags and attributes.

Using BeautifulSoup

BeautifulSoup is a popular Python library that simplifies web scraping tasks by making it easy to parse and navigate through HTML. To get started, import the library and request the page content you want to scrape, then create a BeautifulSoup object to parse the data:

from bs4 import BeautifulSoup
import requests

url = "example_website"
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")

Now you have a BeautifulSoup object and can start extracting data from the HTML.

Locating Elements by Tags and Attributes

BeautifulSoup provides various methods to locate elements by their tags and attributes. Some common methods include find(), find_all(), select(), and select_one().

Let’s see these methods in action:

# Find the first <span> tag
span_tag = soup.find("span")

# Find all <span> tags
all_span_tags = soup.find_all("span")

# Locate elements using CSS selectors
title = soup.select_one("title")

# Find all <a> tags with the "href" attribute
links = soup.find_all("a", {"href": True})

These methods allow you to easily navigate and extract data from an HTML structure.

Once you have located the HTML elements containing the needed data, you can extract the text and attributes.

Here’s how:

# Extract text from a tag
text = span_tag.text

# Extract an attribute value
url = links[0]["href"]

Finally, to save the extracted data into a CSV file, you can use Python’s built-in csv module.

import csv

# Writing extracted data to a CSV file
with open("output.csv", "w", newline="") as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(["Index", "Title"])
    for index, link in enumerate(links, start=1):
        writer.writerow([index, link.text])

Following these steps, you can successfully extract data from HTML using Python and BeautifulSoup, and save it as a CSV file.

Recommended: Basketball Statistics – Page Scraping Using Python and BeautifulSoup

Organizing Data

This section explains how to create a dictionary to store the scraped data and how to write the organized data into a CSV file.

Creating a Dictionary

Begin by defining an empty dictionary that will store the extracted data elements.

In this case, the focus is on quotes, authors, and any associated tags. Each extracted element should have its own key, and the value should be a list that contains individual instances of that element.

For example:

data = {
    "quotes": [],
    "authors": [],
    "tags": []
}

As you scrape the data, append each item to its respective list. This approach makes the information easy to index and retrieve when needed.
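
Here’s a minimal sketch of that append step. It assumes the page has already been parsed into a soup object and that each quote sits in a div with a quote class (the selectors here are assumptions for illustration, not a fixed API):

for quote_block in soup.select("div.quote"):
    data["quotes"].append(quote_block.select_one("span.text").get_text())
    data["authors"].append(quote_block.select_one("small.author").get_text())
    data["tags"].append([t.get_text() for t in quote_block.select("a.tag")])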

Working with DataFrames and Pandas

Once the data is stored in a dictionary, it’s time to convert it into a dataframe. Using the Pandas library, it’s easy to transform the dictionary into a dataframe where the keys become the column names and the respective lists become the columns’ values.

Simply use the following command:

import pandas as pd

df = pd.DataFrame(data)

Exporting Data to a CSV File

With the dataframe prepared, it’s time to write it to a CSV file. Thankfully, Pandas comes to the rescue once again. Using the dataframe’s built-in .to_csv() method, it’s possible to create a CSV file from the dataframe, like this:

df.to_csv('scraped_data.csv', index=False)

This command will generate a CSV file called 'scraped_data.csv' containing the organized data with columns for quotes, authors, and tags. The index=False parameter ensures that the dataframe’s index isn’t added as an additional column.

Recommended: 17 Ways to Read a CSV File to a Pandas DataFrame

And there you have it—a neat, organized CSV file containing your scraped data!

Handling Pagination

This section will discuss how to handle pagination while scraping data from multiple URLs using Python to save the extracted content in a CSV format. It is essential to manage pagination effectively because most websites display their content across several pages.

Looping Through Web Pages

Looping through web pages requires the developer to identify a pattern in the URLs, which can assist in iterating over them seamlessly. Typically, this pattern would include the page number as a variable, making it easy to adjust during the scraping process.

Once the pattern is identified, you can use a for loop to iterate over a range of page numbers. For each iteration, update the URL with the page number and then proceed with the scraping process. This method allows you to extract data from multiple pages systematically.

For instance, let’s consider that the base URL for every page is "https://www.example.com/listing?page=", where the page number is appended to the end.

Here is a Python example that demonstrates handling pagination when working with such URLs:

import requests
from bs4 import BeautifulSoup
import csv

base_url = "https://www.example.com/listing?page="

with open("scraped_data.csv", "w", newline="") as csvfile:
    csv_writer = csv.writer(csvfile)
    csv_writer.writerow(["Data_Title", "Data_Content"])  # Header row

    for page_number in range(1, 6):  # Loop through page numbers 1 to 5
        url = base_url + str(page_number)
        response = requests.get(url)
        soup = BeautifulSoup(response.text, "html.parser")
        
        # TODO: Add scraping logic here and write the content to the CSV file.

In this example, the script iterates through the first five pages of the website and writes the scraped content to a CSV file. Note that you will need to implement the actual scraping logic (e.g., extracting the desired content using Beautiful Soup) based on the website’s structure.

Handling pagination with Python allows you to collect more comprehensive data sets, improving the overall success of your web scraping efforts. Make sure to respect the website’s robots.txt rules and rate limits to ensure responsible data collection.
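
One simple courtesy measure is pausing between requests with the standard library’s time module (the one-second delay below is an arbitrary choice, not a universal rule):

import time

for page_number in range(1, 6):
    url = base_url + str(page_number)
    response = requests.get(url)
    # ... scrape the page here ...
    time.sleep(1)  # wait a second between requests to avoid hammering the server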

Exporting Data to CSV

You can export web scraping data to a CSV file in Python using the Python CSV module and the Pandas to_csv function. Both approaches are widely used and efficiently handle large amounts of data.

Python CSV Module

The Python CSV module is a built-in library that offers functionalities to read from and write to CSV files. It is simple and easy to use. To begin, import the csv module.

import csv

To write the scraped data to a CSV file, open the file in write mode ('w') with a specified file name, create a CSV writer object, and write the data using the writerow() or writerows() methods as required.

with open('data.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerow(["header1", "header2", "header3"])
    writer.writerows(scraped_data)

In this example, the header row is written first, followed by the rows of data obtained through web scraping.

Using Pandas to_csv()

Another alternative is the powerful library Pandas, often used in data manipulation and analysis. To use it, start by importing the Pandas library.

import pandas as pd

Pandas offers the to_csv() method, which can be applied to a DataFrame. If you have web-scraped data and stored it in a DataFrame, you can easily export it to a CSV file with the to_csv() method, as shown below:

dataframe.to_csv('data.csv', index=False)

In this example, the index parameter is set to False to exclude the DataFrame index from the CSV file.

The Pandas library also provides options for handling missing values, date formatting, and customizing separators and delimiters, making it a versatile choice for data export.
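
For instance, a few of those options look like this (the parameter values are arbitrary examples):

dataframe.to_csv(
    'data.csv',
    sep=';',                 # use a semicolon instead of a comma
    na_rep='N/A',            # how missing values appear in the file
    date_format='%Y-%m-%d',  # formatting applied to datetime columns
    index=False,
)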

10 Minutes to Pandas in 5 Minutes

If you’re just getting started with Pandas, I’d recommend you check out our free blog guide (it’s only 5 minutes!):

Recommended: 5 Minutes to Pandas — A Simple Helpful Guide to the Most Important Pandas Concepts (+ Cheat Sheet)

Be on the Right Side of Change

Python for Beginners: Pandas Where Method With Series and DataFrame

While working with pandas dataframes, we often filter data using different conditions. In this article, we will discuss how we can use the pandas where method to filter and replace data from a series or dataframe.

The Pandas where() Method

We use the pandas where() method to replace a value based on a condition. The where() method has the following syntax.

DataFrame.where(cond, other=_NoDefault.no_default, *, inplace=False, axis=None, level=None)

Here, 

  • The cond parameter takes a condition or multiple conditional statements as input arguments. The conditional statements must evaluate to a series of True and False values. If the cond parameter is True for a row, the data is preserved in that row. All the values are set to None for the rows where the cond parameter evaluates to False. 
  • The other parameter takes a function, series, dataframe, or scalar value as its input argument. All the entries where the cond parameter evaluates to False are replaced with the corresponding value from the other parameter. If we pass a function to the other parameter, it is computed on the DataFrame and should return a scalar or Series/DataFrame. The function must not change the input DataFrame. If we don’t specify the other parameter, all the values are set to None for the rows where the cond parameter evaluates to False. 
  • By default, the where() method returns a new dataframe after execution. If you want to modify the existing dataframe using the where() method, you can set the inplace parameter to True. After this, the original dataframe will be modified to store the output (see the short sketch after this list).
  • The axis parameter is used to set the alignment axis if needed. For Series, the axis parameter is unused. For dataframes, it has a default value of 0.
  • The level parameter is used to set the alignment level if required.
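
As a quick sketch of the inplace parameter mentioned above:

import pandas as pd

s = pd.Series([10, 60, 20, 80])
s.where(s > 50, other=0, inplace=True)  # modifies s directly and returns None
print(s.tolist())  # Output: [0, 60, 0, 80]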

Now, let us discuss how we can use the where() method with a series or a dataframe.

Pandas Where() Method With Series in Python

When we invoke the where() method on a pandas series, it takes a condition as its input argument. After execution, it returns a new series. In the output series, the values that fulfill the condition in the input argument are unchanged, while the rest of the values are set to None. You can observe this in the following example.

import pandas as pd
series=pd.Series([1,23,12,423,4,53,231,234,1])
print("The input series is:")
print(series)
output=series.where(series>50)
print("The output series is:")
print(output)

Output:

The input series is:
0      1
1     23
2     12
3    423
4      4
5     53
6    231
7    234
8      1
dtype: int64
The output series is:
0      NaN
1      NaN
2      NaN
3    423.0
4      NaN
5     53.0
6    231.0
7    234.0
8      NaN
dtype: float64

In the above example, we passed the condition series>50 to the where() method. In the output series, you can observe that the where() method preserves the numbers greater than 50. On the other hand, values less than 50 are set to None.

Replace a Value Based on a Condition Using The where() Method

Instead of None, we can also set a replacement value for the values in the series that don’t fulfill the condition given in the input to the where() method. For this, we will pass the replacement value as the second input argument to the where() method. After execution, it returns a series in which the values that fulfill the condition remain unchanged while the other values are replaced using the replacement value. You can observe this in the following example.

import pandas as pd
series=pd.Series([1,23,12,423,4,53,231,234,1])
print("The input series is:")
print(series)
output=series.where(series>50,-1)
print("The output series is:")
print(output)

Output:

The input series is:
0      1
1     23
2     12
3    423
4      4
5     53
6    231
7    234
8      1
dtype: int64
The output series is:
0     -1
1     -1
2     -1
3    423
4     -1
5     53
6    231
7    234
8     -1
dtype: int64

In the above example, we have set the other parameter to -1. Hence, the numbers less than 50 are set to -1 in the output series.

Replace a Value Using a Function Based on a Condition Using The where() Method

Instead of a value, we can also pass a function for replacing the values in the series using the where() method. For instance, consider the following example.

import pandas as pd

def myFun(x):
    return x**2

series=pd.Series([1,23,12,423,4,53,231,234,1])
print("The input series is:")
print(series)
output=series.where(series>50,other=myFun)
print("The output series is:")
print(output)

Output:

The input series is:
0      1
1     23
2     12
3    423
4      4
5     53
6    231
7    234
8      1
dtype: int64
The output series is:
0      1
1    529
2    144
3    423
4     16
5     53
6    231
7    234
8      1
dtype: int64

In the above code, we have defined a function myFun() that takes a number and returns its square. Then, we passed the function to the other parameter in the where() method. The where() method computes myFun() on the series (squaring every value) and uses its output in the output series at all the positions where the cond parameter is False.

Pandas Where Method With DataFrame

Instead of a series, we can also use the where() method on a dataframe. When we invoke the where() method on a dataframe, it takes a condition as its input argument. After execution, it returns a dataframe created from the input dataframe.

Here, the rows that fulfill the condition given as input to the where() method remain unchanged. All the other rows are filled with a None value. You can observe this in the following example.

import pandas as pd
myDicts=[{"Roll":1,"Maths":100, "Physics":80, "Chemistry": 90},
        {"Roll":2,"Maths":80, "Physics":100, "Chemistry": 90},
        {"Roll":3,"Maths":90, "Physics":80, "Chemistry": 70},
        {"Roll":4,"Maths":100, "Physics":100, "Chemistry": 90},
        {"Roll":5,"Maths":90, "Physics":90, "Chemistry": 80},
        {"Roll":6,"Maths":80, "Physics":70, "Chemistry": 70}]
df=pd.DataFrame(myDicts)
print("The input dataframe is:")
print(df)
df1=df.where(df["Maths"]>80)
print("The output dataframe is:")
print(df1)

Output:

The input dataframe is:
   Roll  Maths  Physics  Chemistry
0     1    100       80         90
1     2     80      100         90
2     3     90       80         70
3     4    100      100         90
4     5     90       90         80
5     6     80       70         70
The output dataframe is:
   Roll  Maths  Physics  Chemistry
0   1.0  100.0     80.0       90.0
1   NaN    NaN      NaN        NaN
2   3.0   90.0     80.0       70.0
3   4.0  100.0    100.0       90.0
4   5.0   90.0     90.0       80.0
5   NaN    NaN      NaN        NaN

Instead of the None value, we can also give a replacement value to the where() method as shown below.

import pandas as pd
myDicts=[{"Roll":1,"Maths":100, "Physics":80, "Chemistry": 90},
        {"Roll":2,"Maths":80, "Physics":100, "Chemistry": 90},
        {"Roll":3,"Maths":90, "Physics":80, "Chemistry": 70},
        {"Roll":4,"Maths":100, "Physics":100, "Chemistry": 90},
        {"Roll":5,"Maths":90, "Physics":90, "Chemistry": 80},
        {"Roll":6,"Maths":80, "Physics":70, "Chemistry": 70}]
df=pd.DataFrame(myDicts)
print("The input dataframe is:")
print(df)
df1=df.where(df["Maths"]>80,"LOW")
print("The output dataframe is:")
print(df1)

Output:

The input dataframe is:
   Roll  Maths  Physics  Chemistry
0     1    100       80         90
1     2     80      100         90
2     3     90       80         70
3     4    100      100         90
4     5     90       90         80
5     6     80       70         70
The output dataframe is:
  Roll Maths Physics Chemistry
0    1   100      80        90
1  LOW   LOW     LOW       LOW
2    3    90      80        70
3    4   100     100        90
4    5    90      90        80
5  LOW   LOW     LOW       LOW

In the above examples, you can observe that the where() method works in a similar manner as it does with a series. The only difference is that the results are applied to entire rows instead of single values.

Pandas where() Method With Multiple Conditions

We can also use multiple conditions in a single where() method. For this, we combine the conditions with the logical AND/OR operators (& and |). After each condition is evaluated, the logical operations produce a mask containing True and False values. The mask is then used to create the output dataframe. You can observe this in the following example.

import pandas as pd
myDicts=[{"Roll":1,"Maths":100, "Physics":80, "Chemistry": 90},
        {"Roll":2,"Maths":80, "Physics":100, "Chemistry": 90},
        {"Roll":3,"Maths":90, "Physics":80, "Chemistry": 70},
        {"Roll":4,"Maths":100, "Physics":100, "Chemistry": 90},
        {"Roll":5,"Maths":90, "Physics":90, "Chemistry": 80},
        {"Roll":6,"Maths":80, "Physics":70, "Chemistry": 70}]
df=pd.DataFrame(myDicts)
print("The input dataframe is:")
print(df)
df1=df.where((df["Maths"]>80) & (df["Chemistry"]>80))
print("The output dataframe is:")
print(df1)

Output:

The input dataframe is:
   Roll  Maths  Physics  Chemistry
0     1    100       80         90
1     2     80      100         90
2     3     90       80         70
3     4    100      100         90
4     5     90       90         80
5     6     80       70         70
The output dataframe is:
   Roll  Maths  Physics  Chemistry
0   1.0  100.0     80.0       90.0
1   NaN    NaN      NaN        NaN
2   NaN    NaN      NaN        NaN
3   4.0  100.0    100.0       90.0
4   NaN    NaN      NaN        NaN
5   NaN    NaN      NaN        NaN

Conclusion

In this article, we discussed different ways to use the pandas where method with a series or dataframe in Python. To learn more about Python programming, you can read this article on how to read an Excel file into a pandas dataframe. You might also like this article on how to map functions to a pandas series in Python.

I hope you enjoyed reading this article. Stay tuned for more informative articles.

Happy Learning!

The post Pandas Where Method With Series and DataFrame appeared first on PythonForBeginners.com.

Planet Python