Python for Beginners: PySpark Read CSV File With Examples

The CSV file format is one of the most widely used formats for storing tabular data. In this article, we will discuss different ways to read a CSV file in PySpark.

Pyspark Read CSV File Using The csv() Method

To read a CSV file into a PySpark dataframe, we can use the csv() method of the DataFrameReader object, which is available as spark.read. The csv() method takes the filename of the CSV file and returns a PySpark dataframe, as shown below.

import pyspark.sql as ps
spark = ps.SparkSession.builder \
      .master("local[*]") \
      .appName("readcsv_example") \
      .getOrCreate()

dfs=spark.read.csv("sample_csv_file.csv")
print("The input csv file is:")
dfs.show()
spark.sparkContext.stop()

Output:

The input csv file is:
+-------+-----+-------+---------+
|    _c0|  _c1|    _c2|      _c3|
+-------+-----+-------+---------+
|   Name|Maths|Physics|Chemistry|
| Aditya|   45|     89|       1 |
|  Chris|   86|     85|        2|
|   Joel| null|     85|       3 |
|Katrina|   49|     47|        4|
| Agatha|   76|     89|        5|
|    Sam|   76|     98|        6|
+-------+-----+-------+---------+

In the above example, we have used the following CSV file.
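Based on the output above, sample_csv_file.csv is assumed to contain data along the following lines (shown here for reference; the exact formatting of the Chemistry column may differ, since a later example infers it as a double):

Name,Maths,Physics,Chemistry
Aditya,45,89,1
Chris,86,85,2
Joel,,85,3
Katrina,49,47,4
Agatha,76,89,5
Sam,76,98,6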

In the output, you can observe that the column names are given as _c0, _c1, _c2, and _c3. However, the actual column names should be Name, Maths, Physics, and Chemistry. Hence, we need a way to read the CSV file with its column names.

Read CSV With Header as Column Names

To read a csv file with column names, you can use the header parameter in the csv() method. When we set the header parameter to True in the csv() method, the first row of the csv file is treated as the column names. You can observe this in the following example.

import pyspark.sql as ps
spark = ps.SparkSession.builder \
      .master("local[*]") \
      .appName("readcsv_example") \
      .getOrCreate()

dfs=spark.read.csv("sample_csv_file.csv",header=True)
print("The input csv file is:")
dfs.show()
spark.sparkContext.stop()

Output:

The input csv file is:
+-------+-----+-------+---------+
|   Name|Maths|Physics|Chemistry|
+-------+-----+-------+---------+
| Aditya|   45|     89|       1 |
|  Chris|   86|     85|        2|
|   Joel| null|     85|       3 |
|Katrina|   49|     47|        4|
| Agatha|   76|     89|        5|
|    Sam|   76|     98|        6|
+-------+-----+-------+---------+

In this example, we have set the header parameter to True in the csv() method. Hence, the first line of the csv file is read as column names.

Read CSV With inferSchema Parameter

By default, the csv() method reads all the values as strings. For example, if we print the data types using the dtypes attribute of the pyspark dataframe, you can observe that all the columns have the string data type. 

import pyspark.sql as ps
spark = ps.SparkSession.builder \
      .master("local[*]") \
      .appName("readcsv_example") \
      .getOrCreate()

dfs=spark.read.csv("sample_csv_file.csv",header=True)
print("The input csv file is:")
dfs.show()
print("The data type of columns is:")
print(dfs.dtypes)
spark.sparkContext.stop()

Output:

The input csv file is:
+-------+-----+-------+---------+
|   Name|Maths|Physics|Chemistry|
+-------+-----+-------+---------+
| Aditya|   45|     89|       1 |
|  Chris|   86|     85|        2|
|   Joel| null|     85|       3 |
|Katrina|   49|     47|        4|
| Agatha|   76|     89|        5|
|    Sam|   76|     98|        6|
+-------+-----+-------+---------+

The data type of columns is:
[('Name', 'string'), ('Maths', 'string'), ('Physics', 'string'), ('Chemistry', 'string')]

In the above output, you can observe that all the columns have string data types irrespective of the values in the columns.

To read a csv file with the correct data types for its columns, we can use the inferSchema parameter in the csv() method. When we set the inferSchema parameter to True, Spark scans the values in each column and assigns the most suitable data type to it. You can observe this in the following example.

import pyspark.sql as ps
spark = ps.SparkSession.builder \
      .master("local[*]") \
      .appName("readcsv_example") \
      .getOrCreate()

dfs=spark.read.csv("sample_csv_file.csv",header=True,inferSchema=True)
print("The input csv file is:")
dfs.show()
print("The data type of columns is:")
print(dfs.dtypes)
spark.sparkContext.stop()

Output:

The input csv file is:
+-------+-----+-------+---------+
|   Name|Maths|Physics|Chemistry|
+-------+-----+-------+---------+
| Aditya|   45|     89|      1.0|
|  Chris|   86|     85|      2.0|
|   Joel| null|     85|      3.0|
|Katrina|   49|     47|      4.0|
| Agatha|   76|     89|      5.0|
|    Sam|   76|     98|      6.0|
+-------+-----+-------+---------+

The data type of columns is:
[('Name', 'string'), ('Maths', 'int'), ('Physics', 'int'), ('Chemistry', 'double')]

In this example, we have set the inferSchema parameter to True. Hence, the columns are given proper data types.

Why Should You Avoid Using The inferSchema Parameter in PySpark?

Using the inferSchema parameter to decide the data types of the columns in a pyspark dataframe is a costly operation. When we set the inferSchema parameter to True, Spark has to scan the values in the csv file before it can decide the data type of each column. For large datasets, this extra pass over the data is expensive, which is why setting inferSchema to True isn't recommended for large files.
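If you still want automatic type inference, one possible mitigation is the samplingRatio option of the csv() method, which infers the schema from only a fraction of the rows. The following is a minimal sketch, assuming the option is available in your Spark version; verify the behaviour before relying on it.

# A sketch: infer the schema from roughly 10% of the rows instead of all of them.
# samplingRatio is assumed to be supported by the csv() reader in your Spark version.
dfs = spark.read.csv(
    "sample_csv_file.csv",
    header=True,
    inferSchema=True,
    samplingRatio=0.1,
)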

PySpark Read CSV File With Schema

Instead of using the inferSchema parameter, we can read csv files with specified schemas. 

A schema contains the column names, their data types, and a boolean value nullable to specify if a particular column can contain null values or not. 

To define the schema for a pyspark dataframe, we use the StructType() function and the StructField() function. 

The StructField() function is used to define the name and data type of a particular column. It takes the column name as its first input argument and the data type of the column as its second input argument. To specify the data type of a column, we use StringType(), IntegerType(), FloatType(), DoubleType(), and the other types defined in the pyspark.sql.types module. 

In the third input argument to the StructField() function, we pass True or False specifying if the column can contain null values or not. If we set the third parameter to True, the column will allow null values. Otherwise, it will not.

The StructType() function is used to create the schema for the pyspark dataframe. It takes a list of StructField objects as its input argument and returns a StructType object that we can use as a schema.

To read a csv file with schema using pyspark, we will use the following steps.

  1. First, we will define the data type for each column using the StructField() function.
  2. Next, we will pass a list of all the StructField objects to the StructType() function to create a schema.
  3. Finally, we will pass the StructType object to the schema parameter in the csv() function while reading the csv file.

By executing the above steps, we can read a csv file in pyspark with a given schema. You can observe this in the following example.

import pyspark.sql as ps
from pyspark.sql.types import StructType, StructField, IntegerType, StringType
spark = ps.SparkSession.builder \
      .master("local[*]") \
      .appName("readcsv_example") \
      .getOrCreate()
list_of_cols=[StructField("Name",StringType(),True),
             StructField("Maths",IntegerType(),True),
             StructField("Physics",IntegerType(),True),
             StructField("Chemistry",IntegerType(),True)]
schema=StructType(list_of_cols)
print("The schema is:")
print(schema)
spark.sparkContext.stop()

Output:

The schema is:
StructType([StructField('Name', StringType(), True), StructField('Maths', IntegerType(), True), StructField('Physics', IntegerType(), True), StructField('Chemistry', IntegerType(), True)])

In the above code, we have defined the schema for the csv file using the StructField() function and the StructType() function.

After defining the schema, you can pass it to the csv() method to read the csv file with a proper data type for each column as shown in the following example.

import pyspark.sql as ps
from pyspark.sql.types import StructType, StructField, IntegerType, StringType
spark = ps.SparkSession.builder \
      .master("local[*]") \
      .appName("readcsv_example") \
      .getOrCreate()
list_of_cols=[StructField("Name",StringType(),True),
             StructField("Maths",IntegerType(),True),
             StructField("Physics",IntegerType(),True),
             StructField("Chemistry",IntegerType(),True)]
schema=StructType(list_of_cols)
dfs=spark.read.csv("sample_csv_file.csv",header=True,schema=schema)
print("The input csv file is:")
dfs.show()
print("The data type of columns is:")
print(dfs.dtypes)
spark.sparkContext.stop()

Output:

The input csv file is:
+-------+-----+-------+---------+
|   Name|Maths|Physics|Chemistry|
+-------+-----+-------+---------+
| Aditya|   45|     89|     null|
|  Chris|   86|     85|        2|
|   Joel| null|     85|     null|
|Katrina|   49|     47|        4|
| Agatha|   76|     89|        5|
|    Sam|   76|     98|        6|
+-------+-----+-------+---------+

The data type of columns is:
[('Name', 'string'), ('Maths', 'int'), ('Physics', 'int'), ('Chemistry', 'int')]

In the above example, we have read a CSV file using a schema. Observe that values that cannot be converted to the data type given in the schema are replaced with null values.
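As an aside, the schema parameter also accepts a DDL-formatted string, which can be more compact than building StructType objects by hand. The following is a minimal sketch of the same schema expressed as a string; check that your Spark version accepts DDL strings here before relying on it.

# A sketch: the same schema expressed as a DDL string instead of StructType/StructField objects.
dfs = spark.read.csv(
    "sample_csv_file.csv",
    header=True,
    schema="Name STRING, Maths INT, Physics INT, Chemistry INT",
)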

Read CSV With Different Delimiter in PySpark

A CSV file need not use the comma character as its delimiter. It might instead use characters like tabs, spaces, colons (:), semicolons (;), pipe characters (|), etc., as its delimiter. For example, let us take the following file, demo_file.csv, which uses the pipe character as its delimiter.
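Based on the output below, demo_file.csv is assumed to contain roughly the following pipe-delimited data.

Name|Roll|Language|Extra
Aditya|1|Python|11
Sam|2|Java|12
Chris|3|C++|13
Joel|4|TypeScript|14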

To read a csv file in pyspark with a given delimiter, you can use the sep parameter in the csv() method. The csv() method takes the delimiter as an input argument to the sep parameter and returns the pyspark dataframe as shown below.

import pyspark.sql as ps
from pyspark.sql.types import StructType, StructField, IntegerType, StringType
spark = ps.SparkSession.builder \
      .master("local[*]") \
      .appName("readcsv_example") \
      .getOrCreate()
dfs=spark.read.csv("demo_file.csv",header=True,inferSchema=True, sep="|")
print("The input csv file is:")
dfs.show()
print("The data type of columns is:")
print(dfs.dtypes)
spark.sparkContext.stop()

Output:

The input csv file is:
+------+----+----------+-----+
|  Name|Roll|  Language|Extra|
+------+----+----------+-----+
|Aditya|   1|    Python|   11|
|   Sam|   2|      Java|   12|
| Chris|   3|       C++|   13|
|  Joel|   4|TypeScript|   14|
+------+----+----------+-----+

The data type of columns is:
[('Name', 'string'), ('Roll', 'int'), ('Language', 'string'), ('Extra', 'int')]

In the above example, the csv file contains the | character as its delimiter. To read the file, we have passed the | character to the sep parameter as input in the csv() method.

Read Multiple CSV Files into a Single PySpark DataFrame

To read multiple csv files into a pyspark dataframe at once, you can pass the list of filenames to the csv() method as its first input argument. After execution, the csv() method will return the pyspark dataframe with data from all files as shown below.

import pyspark.sql as ps
from pyspark.sql.types import StructType, StructField, IntegerType, StringType
spark = ps.SparkSession.builder \
      .master("local[*]") \
      .appName("readcsv_example") \
      .getOrCreate()
dfs=spark.read.csv(["demo_file.csv","demo_file2.csv"],header=True,inferSchema=True, sep="|")
print("The input csv files are:")
dfs.show()
print("The data type of columns is:")
print(dfs.dtypes)
spark.sparkContext.stop()

Output:

The input csv files are:
+------+----+----------+-----+
|  Name|Roll|  Language|Extra|
+------+----+----------+-----+
|Aditya|   1|    Python|   11|
|   Sam|   2|      Java|   12|
| Chris|   3|       C++|   13|
|  Joel|   4|TypeScript|   14|
|George|  12|        C#|   15|
|  Sean|  13|       SQL|   16|
|   Joe|  14|       PHP|   17|
|   Sam|  15|JavaScript|   18|
+------+----+----------+-----+

The data type of columns is:
[('Name', 'string'), ('Roll', 'int'), ('Language', 'string'), ('Extra', 'int')]

In the above example, we have used the following files.
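demo_file.csv is the pipe-delimited file shown earlier. Based on the output above, demo_file2.csv is assumed to contain roughly the following data.

Name|Roll|Language|Extra
George|12|C#|15
Sean|13|SQL|16
Joe|14|PHP|17
Sam|15|JavaScript|18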

In the output, you can observe that the contents of the files are stacked vertically, one below the other, in the order in which they are passed to the csv() method.
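As an aside, the csv() method also accepts a directory path or a glob pattern, which can be handier than listing every file. This is a minimal sketch, assuming the files live in a data/ directory.

# A sketch: read every CSV file in the data/ directory into a single dataframe.
dfs = spark.read.csv("data/*.csv", header=True, inferSchema=True, sep="|")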

Multiple CSV Files With Different Column Names

If the files that we pass to the csv() method have the same number of columns but different column names, the output dataframe will contain the column names of the first csv file. The data in the columns are matched by position to create the output dataframe. You can observe this in the following example.

import pyspark.sql as ps
from pyspark.sql.types import StructType, StructField, IntegerType, StringType
spark = ps.SparkSession.builder \
      .master("local[*]") \
      .appName("readcsv_example") \
      .getOrCreate()
dfs=spark.read.csv(["demo_file.csv","demo_file2.csv"],header=True,inferSchema=True, sep="|")
print("The input csv files are:")
dfs.show()
print("The data type of columns is:")
print(dfs.dtypes)
spark.sparkContext.stop()

Output:

The input csv files are:
23/07/09 04:54:17 WARN CSVHeaderChecker: CSV header does not conform to the schema.
 Header: Name, Roll, Language, Extra
 Schema: Name, Roll, Language, Ratings
Expected: Ratings but found: Extra
CSV file: file:///home/aditya1117/codes/demo_file2.csv
+------+----+----------+-------+
|  Name|Roll|  Language|Ratings|
+------+----+----------+-------+
|Aditya|   1|    Python|     11|
|   Sam|   2|      Java|     12|
| Chris|   3|       C++|     13|
|  Joel|   4|TypeScript|     14|
|George|  12|        C#|     15|
|  Sean|  13|       SQL|     16|
|   Joe|  14|       PHP|     17|
|   Sam|  15|JavaScript|     18|
+------+----+----------+-------+

The data type of columns is:
[('Name', 'string'), ('Roll', 'string'), ('Language', 'string'), ('Ratings', 'string')]

In the above example, the first csv file (demo_file.csv, whose last column is now named Ratings) has the column names Name, Roll, Language, and Ratings. The second csv file, demo_file2.csv, has Extra as its last column instead of Ratings.

In the output, you can observe that the column names of the first csv file are used as the schema. Hence, PySpark prints a warning when it encounters a different column name in the second file.

CSV Files With Different Numbers of Columns in PySpark

If the input files contain different numbers of columns, the column names in the schema of the output dataframe are taken from the CSV file with more columns. The rows from the file with fewer columns are filled with null values in the extra columns. 

To understand this, let us add an extra column to the demo_file.csv. The updated file is as follows.
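Based on the output below, the updated demo_file.csv is assumed to look roughly like this.

Name|Roll|Language|Ratings|Grade
Aditya|1|Python|11|A
Sam|2|Java|12|A
Chris|3|C++|13|A+
Joel|4|TypeScript|14|A+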

Now, let us read both files into a pyspark dataframe using the csv() function.

import pyspark.sql as ps
from pyspark.sql.types import StructType, StructField, IntegerType, StringType
spark = ps.SparkSession.builder \
      .master("local[*]") \
      .appName("readcsv_example") \
      .getOrCreate()
dfs=spark.read.csv(["demo_file2.csv","demo_file.csv"],header=True,inferSchema=True, sep="|")
print("The input csv files are:")
dfs.show()
print("The data type of columns is:")
print(dfs.dtypes)
spark.sparkContext.stop()

Output:

The input csv files are:
23/07/09 04:57:08 WARN CSVHeaderChecker: Number of column in CSV header is not equal to number of fields in the schema:
 Header length: 4, schema size: 5
CSV file: file:///home/aditya1117/codes/demo_file2.csv
+------+----+----------+-------+-----+
|  Name|Roll|  Language|Ratings|Grade|
+------+----+----------+-------+-----+
|Aditya|   1|    Python|     11|    A|
|   Sam|   2|      Java|     12|    A|
| Chris|   3|       C++|     13|   A+|
|  Joel|   4|TypeScript|     14|   A+|
|George|  12|        C#|     15| null|
|  Sean|  13|       SQL|     16| null|
|   Joe|  14|       PHP|     17| null|
|   Sam|  15|JavaScript|     18| null|
+------+----+----------+-------+-----+

The data type of columns is:
[('Name', 'string'), ('Roll', 'string'), ('Language', 'string'), ('Ratings', 'string'), ('Grade', 'string')]

In the above code, demo_file.csv contains 5 columns while demo_file2.csv contains only 4. Hence, the column names given in demo_file.csv are selected for the schema even though we passed it as the second file to the csv() method. You can also observe that the rows from demo_file.csv appear at the top of the output dataframe, as the schema is selected from this file.

Conclusion

In this article, we have discussed different ways to read a CSV file in PySpark. To learn more about PySpark, you can read this article on PySpark vs pandas. You might also like this article on how to create an empty PySpark dataframe.

I hope you enjoyed reading this article. Stay tuned for more informative articles. 

Happy Learning!

The post PySpark Read CSV File With Examples appeared first on PythonForBeginners.com.

Planet Python

Eugene Stoner: The Man Behind the AR-15


Everybody is familiar with the iconic AR-15, but just where did it come from?

To learn the history of the AR-15, you have to first look at the genius behind it…Eugene Stoner.

Eugene Stoner with AR-10 rifles. (Photo: Small Arms Review)

So, follow along as we talk about Stoner, his life, and what led him to create one of the most notable rifles in history.

The Early Years

Born in 1922, Stoner graduated from high school right in time for the beginning of World War II. Immediately after graduation, he landed a job at Vega Aircraft Company, installing ordnance. It was here that he would first learn about manufacturing arms.

But then Pearl Harbor happened, leading Stoner to join the Marines soon after.

Burning ships at Pearl Harbor

His background in ordnance resulted in him being shipped to the Pacific Theater, where he was involved in aviation ordnance.

After the war, Stoner hopped around from a few different engineering jobs until he landed a position with a small division of the Fairchild Engine and Airplane Corporation known as Armalite.

Stoner’s AR-5 & AR-10

Stoner’s first major accomplishment at Armalite was developing a new survival weapon for U.S. Air Force pilots.

This weapon was designed to easily stow away under an airplane’s seat, and in the event of a crash, a pilot would have a rifle at the ready to harvest small game and serve as an acceptable form of self-defense as well.

An AR-5

The result was known as the Armalite Rifle 5 – the AR-5. Though the modern semi-auto version is known as the AR-7, this weapon can still be found in gun cabinets across America.

Eugene Stoner had already left his mark but was far from fading into the shadows. He was just getting started.

The AR-10

Stoner continued his work at Armalite, but it wasn’t long until another opportunity appeared for him to change the course of history…the Vietnam War.

In 1955 the U.S. Army put out a notice that they were looking for a new battle rifle. A year later, the Army further defined they wanted the new weapon to fire the 7.62 NATO.

7.62x39mm

Tinkering in his garage, Stoner emerged with a prototype for a new rifle not long afterward called the AR-10.

The AR-10 was the first rifle of its kind, as never before had a rifle utilized the materials Stoner had incorporated.

Guns had always been made of wood and steel, but Stoner drew from his extensive history in the aircraft industry, using lightweight aluminum alloys and fiberglass instead.

Modern-day AR-10

This made his AR-10 a lighter weapon that could better resist weather.

Unfortunately, Stoner was late to the race, and the M14 was chosen as the Army’s battle rifle of choice instead.

Sgt. Maj. Karl Villalino, Combat Center Sergeant Major, aims an M14 service rifle down range during the High Desert Regional Shooting Competition. (Official Marine Corps photo by Lance Cpl. Thomas Mudd/Released)

The designs for the AR-10 were sold to the Dutch instead. Stoner returned to his day job, focusing on the regular rut of daily life.

But then the Army called again…

Eugene Stoner Invents the AR-15

As it turned out, the M14 was too heavy, had too much recoil, and was difficult to control under full auto.

In addition, the 7.62 NATO was overkill within the jungles of Vietnam. Often the enemy couldn’t be seen beyond 50 yards, meaning that a lighter weapon could still accomplish the job and let soldiers carry more ammunition while on patrol.

Adding further urgency to the need was the Soviet development of the AK-47.

Respect to the man who invented the AK, Mikhail Kalashnikov

Amid The Cold War, the idea that the communists may have a better battle rifle than American soldiers was concerning.

So, the Army needed a new battle rifle.

Returning to his AR-10 plans, Stoner set to scaling things down. The AR-10 was modified to use the .223 Remington, with the new rifle designated the Armalite Rifle–15 or AR-15.

Popular 5.56 and .223 Ammo

However, Armalite didn’t have the resources to produce weaponry on a mass scale, so they sold the designs to Colt.

Colt presented the design to the Army, but Army officials dismissed the design. It seemed they preferred the traditional look and feel of wood and steel over the AR-15’s aluminum and plastic.

The U.S. Air Force Saves the AR-15

But the story doesn’t end there…

At an Independence Day cookout in 1960, a Colt contract salesman showed Air Force General Curtis LeMay an AR-15. Immediately, LeMay set up a series of watermelons to test the rifle.

General Curtis LeMay (Photo: Air Force Archive)

LeMay ended up so impressed with the new gun that the very next year – after his promotion to Chief of Staff – he requested 80,000 AR-15s to replace the Air Force’s antiquated M2 rifles.

His request was denied, and the Army kept supplying American soldiers overseas with the M14.

In 1963, the Army and Marines finally ordered 85,000 AR-15s…redesignated as the M16.

An M16 fired from a tank

M16 Faces Trouble

Immediately, the Army began to fiddle with Stoner's design. They changed the powder to one that proved more corrosive and generated much higher pressures.

Also, they added the forward assist (which Stoner hated). Inexplicably, they began to advertise the weapon as “self-cleaning.”

Soldiers with M16s

They then shipped thousands of rifles – without manuals or cleaning gear – to men in combat overseas. Men trained on an entirely different weapon system.

As expected, American soldiers began to experience jammed M16s on the battlefield.

The Stoner 63

By this point, Stoner had left Armalite, served a brief stint as a consultant for Colt, and finally landed a position at Cadillac Gage (now Textron).

It was there, between 1962 and 1963, that he began designing one of the most versatile firearm designs of its time: the Stoner 63.

Stoner 63 (Photo: Wikicommons)

The Stoner 63 was a modular system chambered in 5.56 NATO. Stoner crafted this weapon to be something of a Mr. Potato Head. The lower receiver could be transformed into just about anything.

A carbine, rifle, belt-fed SAW, vehicle-mounted weapon, and top-fed light machine gun were all variations of the Stoner 63, which could easily be crafted from the common receiver.

Navy SEAL in Vietnam keeps his Stoner 63 light machine gun at the ready.

Interchangeable parts were utilized across the platform, and the barrels didn’t need tools to be swapped out. This was the Swiss Army knife of guns. It was truly a game-changer.

The catch was that it didn't work as well on extended missions. There were so many moving parts, with such fine tolerances, that after weeks in the muddy jungle with a Stoner 63, the odds of losing a component or ending up with a dirty, jammed gun were dangerously high.

A member of a U.S. Navy SEAL team uses caution as he watches for any movement in the thick wooded area along a stream, October 1968. (Photo: Wikicommons)

While the system worked wonderfully on quick missions of a few hours, it was deemed too much of a risk for use amongst the basic infantryman.

Despite this, the Stoner 63 still saw widespread use throughout the Special Forces before finally being retired in 1983.

ARES

In 1972, Stoner finally left Cadillac Gage to start his own company, co-founding ARES with a friend.

Aside from making improvements on the Stoner 63 — with the new model called the Stoner 86 — he also began working on yet another rifle design that sadly never took off, known as the Future Assault Rifle Concept or FARC.

Stoner would continue designing weapons with ARES until he received an offer from Knight’s Armament Company.

Some have asked if Eugene Stoner, left, and Mikhail Kalashnikov knew each other…and yes! They were reportedly friends. We would love to know what those hangouts were like.

Knight’s Armament Company

Knight’s Armament Company would be the final company where Stoner would produce his legendary work.

Almost immediately, Stoner developed the SR-25 rifle, a more accurate version of the AR-10.

SR-25 (Photo: Wikicommons)

The Navy SEALs would finally adopt the weapon in 2000 as their Mark 11 Mod 0 Sniper Weapon. It would see use until being phased out 17 years later, in 2017.

Another sniper rifle, the KAC SR-50, was also developed but strangely fell by the wayside due to political pressure.

As police departments nationwide began to upgrade their .38 Special revolvers for the new-tech polymer Glock, Stoner jumped into the fray.

Glock 17 (top) and Glock 19 (bottom), Gen 3

He created a polymer-framed, single-stack, striker-fired design that showed great promise.

But the weapon was so unwieldy and inaccurate (engineers had bumped Stoner’s initial 6-pound trigger pull up to 12 pounds) that it was a fiasco. Colt would later pull it from shelves in 1993 over safety issues.

It was yet another frustrating end to what was originally a great design.

Final Thoughts

Eugene Stoner passed away from brain cancer in 1997 in the garage of his Palm City, Florida home.

By the time of his death, nearly 100 patents had been filed in his name. Not to mention, he'd revolutionized both the world of firearms and Americans' ability to defend themselves.

Eugene Stoner

What are your thoughts on Eugene Stoner and his designs? Let us know in the comments below. Want to learn more about other firearms designers? Check out our list of the 5 Most Influential Gun Inventors. Or, for your very own AR-15, check out our list of the top recommended AR-15 models.

The post Eugene Stoner: The Man Behind the AR-15 appeared first on Pew Pew Tactical.

Pew Pew Tactical

8 Ways to Stay on Top of the Latest Trends in Data Science


Data science is constantly evolving, with new papers and technologies coming out frequently. As such, data scientists may feel overwhelmed when trying to keep up with the latest innovations.

However, with the right tips, you can stay current and remain relevant in this competitive field. Thus, here are eight ways to stay on top of the latest trends in data science.

1. Follow Data Science Blogs and Newsletters

Data science blogs are a great way to brush up on the basics while learning about new ideas and technologies. Several tech conglomerates produce high-quality blog content where you can learn about their latest experiments, research, and projects. Great examples are Google, Facebook, and Netflix blogs, so waste no time checking them out.

Alternatively, you can look into online publications and individual newsletters. Depending on your experience level and advancement in the field, these blogs may address topics you’d find more relatable. For example, Version Control for Jupyter Notebook is easier for a beginner to digest than Google’s Preference learning for cache eviction.

You can find newsletters by doing a simple search, but we’d recommend Data Elixir, Data Science Weekly, and KDnuggets News, as these are some of the best.

2. Listen to Data Science Podcasts and Watch YouTube Videos

Podcasts are easily accessible and a great option when you’re pressed for time and want to get knowledge on the go. Listening to podcasts exposes you to new data science concepts while letting you carry out other activities simultaneously. Also, using interviews with experts in the field, some podcasts offer a window into the industry and let you learn from professionals’ experiences.

On the other hand, YouTube is a better alternative for audio-visual learners and has several videos at your disposal. Channels like Data School and StatQuest with Josh Starmer cover a wide range of topics for both aspiring and experienced data scientists. They also touch on new trends and methods, so following these channels is a good idea to keep current.

It’s easy to get lost in a sea of podcasts and videos, so carefully select detailed videos and the best podcasts for data science. This way, you can acquire accurate knowledge from the best creators and channels.

3. Learn Data Science Skills and Concepts From Courses and Books

Online courses allow learning from data science academics and experts, who condense their years of experience into digestible content. Recent courses cover several data science necessities, from hard-core machine learning to starting a career in data science without a degree. They may not be cheap, but they are well worth their cost in the value they give.

Additionally, books play an important role as well. Reading current data science books can help you learn new techniques, understand real-world data science applications, and develop critical thinking and problem-solving skills. These books explain in-depth data science concepts you may not find elsewhere.

Such books include The Data Science Handbook, Data Science on the Google Cloud Platform, and Think Bayes. You should also check out a few data science courses on sites like Coursera and Udemy.

4. Meet Industry Experts and Enthusiasts From Events and Communities

Attending conferences ushers you into an environment of like-minded individuals you can connect with. Although talking to strangers may feel uncomfortable, you will learn so much from the people at these events. By staying home, you will likely miss out on networking, job opportunities, and modern techniques like deep learning methods.

Furthermore, presentations allow you to observe other projects and familiarize yourself with the latest trends. Seeing what big tech companies are up to is encouraging and educative, and you can always take away something from them to apply in your work.

Data science events can be physical or virtual. Some good data science events to consider are the Open Data Science Conference (ODSC), Data Science Salon, and the Big Data and Analytics Summit.

5. Participate in Data Science Competitions and Hackathons

Data science hackathons unite data scientists to develop models that solve real-world problems within a specified time frame. They can be hosted by various platforms, such as Kaggle, DataHack, or UN Big Data Hackathon.

Participating in hackathons enhances your mastery and accuracy and exposes you to the latest data science tools and popular techniques for building models. Regardless of your results, competing with other data scientists in hackathons offers valuable insights into the latest advancements in data science.

Consider participating in the NERSC Open Hackathon, BNL Open Hackathon, and other virtual hackathons. Also, don’t forget to register for physical hackathons that may be happening near your location.

6. Contribute to Data Science Open Source or Social Good Projects

Contributing to open-source data science projects lets you work with other data scientists in development. From them, you’ll learn new tools and frameworks used by the data science community, and you can study project codes to implement in your work.

Furthermore, you can collaborate with other data scientists with different perspectives in an environment where exchanging ideas, feedback, and insights is encouraged. You can discover the latest techniques data science professionals use, industry standards, best practices, and how they keep up with data science trends.

First, search for repositories tagged with the data science topic on GitHub or Kaggle. Once you discover a project, consider how to contribute, regardless of your skill level, and start collaborating with other data scientists.

7. Follow Data Science Thought Leaders and Influencers on Social Media

Following data science thought leaders and influencers on social media keeps you informed about the latest data science trends. This way, you can learn about their views on existing subject matters and get up-to-date news on data science trends. Additionally, it allows you to ask them about complicated subjects and get their replies.

You can take it a step further and follow Google, Facebook, Apple, and other big tech companies on Twitter. This also keeps you aware of upcoming tech trends beyond just data science.

Kirk Borne, Ronald van Loon, and Ian Goodfellow are some of the biggest names in the data science community. Start following them and big tech companies on Twitter and other social media sites to stay updated.

8. Share Your Data Science Work and Insights

Sharing your work lets you get feedback and suggestions from other data scientists with different experience levels and exposure. Their comments, questions, and critiques can help you stay up-to-date with the latest trends in data science.

You can discover trendy ideas, methods, tools, or resources you may not have known before by listening to their suggestions. For example, a person may unknowingly use an outdated version of Python until he posts his work online and someone points it out.

Sites like Kaggle and Discord have several data science groups through which you can share your work and learn. After signing up and joining a group, start asking questions and interacting with other data scientists. Prioritize knowledge, remember to be humble, and try to build mutually beneficial friendships with other data scientists.

Be a Lifelong Learner in Data Science

Continuous learning is necessary to remain valuable as a data scientist, but it can be difficult to keep up all by yourself. Consequently, you’ll need to find a suitable community to help you, and Discord is one of the best platforms to find one. Find a server with people in the same field, and continue your learning with your new team.

MakeUseOf

The 6 Best Linux Distros for Network Engineers


Linux is commonly preferred among network engineers—so if you’ve thought about installing it for your work, you’re not alone.

If you’re a network engineer, it’s easy to wonder which distributions will have the best features for your work. Here are the six best Linux distributions for network engineering:

1. Fedora

Of all the Linux distributions, one of the most highly regarded among network engineers is Fedora—and there’s a simple reason why.

Fedora is an open-source distribution that serves as a community equivalent to Red Hat Enterprise Linux (RHEL). RHEL itself is commonly chosen as the operating system for enterprise-level systems.

As a result, network engineers who use Fedora enjoy a greater level of familiarity with the RHEL systems they may encounter throughout their careers.

Fedora also offers users an incredible arsenal of open-source tools, built-in support for containerized applications, and consistent access to cutting-edge features and software.

Download: Fedora (free)

2. RHEL

Image Credit: RedHat/Wikimedia under CC BY-SA 4.0

As one of the most popular enterprise distributions, RHEL is a great option because it is robust and reinforced. Each version of RHEL has a 10-year lifecycle, meaning that you’ll be able to use your chosen version of RHEL (and enjoy little to no compatibility issues) for years.

By using RHEL, you’ll also become familiar with many of the systems you’re likely to encounter on the job.

Many of the qualities of RHEL that make it attractive as an enterprise solution are just as appealing for independent users.

RHEL comes pre-equipped with the SELinux security module, so you will find it easy to get started with managing access controls and system policies. You’ll also have access to tools like Cacti and Snort through the RPM and YUM package managers.

Download: RHEL (free for developers; $179 annually)

3. CentOS Stream

Much like Fedora, CentOS Stream is a distribution that stays in line with the development of RHEL. It serves as the upstream edition of RHEL, meaning that the content in the latest edition of CentOS Stream is likely to appear in RHEL’s next release.

While CentOS Stream may not offer the same stability as RHEL, its enticing inclusion of cutting-edge software makes it worth considering.

CentOS Stream also has a distinct advantage over downstream editions of RHEL following Red Hat’s decision to close public access to the source code of RHEL: it will continue to stay in line with the latest experimental changes considered for the next release of RHEL.

In the future, CentOS Stream is likely to become the best option for anyone seeking an RHEL-adjacent distribution.

Download: CentOS Stream (free)

4. openSUSE

Another powerful and reliable option for network engineers is openSUSE. openSUSE is impressively stable and offers frequent new releases, making it a good option if you prefer to avoid broken packages while still taking advantage of the latest software releases.

Out of the box, you won’t have any issues configuring basic network settings through YaST (Yet another Setup Tool). Many of the packages that come preinstalled with openSUSE can provide you with incredible utility.

Wicked is a powerful network configuration framework, for example, while Samba is perfect for enabling file-sharing between Linux and Windows systems. You won’t have any trouble installing the right tool for a job with openSUSE’s Zypper package manager.

Download: openSUSE (free)

5. Debian

Image Credit: Граймс/Wikimedia under CC BY-SA 4.0

Debian is a widely-renowned Linux distribution known for being incredibly stable and high-performance. Several branches of Debian are available, including Debian Stable (which is extremely secure and prioritizes stability) and Debian Unstable (which is more likely to break but provides access to the newest cutting-edge releases of software).

One of the biggest advantages of using Debian for network engineering is that it has an incredible package-rich repository with over 59,000 different software packages.

If you’re interested in trying out the newest niche and experimental tools in networking and cybersecurity, an installation of Debian will provide you with total access.

Download: Debian (free)

6. Kali Linux

Image Credit: Gased Basek/Wikimedia under CC BY-SA 4.0

As a distribution designed for penetration testing, Kali Linux comes with a massive variety of preinstalled tools that network engineers are certain to find useful. Wireshark offers tantalizing information about packets moving across a network, Nmap provides useful clues about network security, and SmokePing provides interesting visualizations of network latency.

Not all of the software packaged with Kali Linux is useful for network engineers, but luckily, new Kali installations are completely customizable. You should plan out what packages you intend to use in advance so that you can avoid installing useless packages and keep your Kali system minimally cluttered.

Download: Kali Linux (free)

Familiarize Yourself With Your New Networking Distro

While some Linux distributions are better suited to network engineers, almost any Linux distribution can be used with the right software and configurations.

You should test out software like Nmap and familiarize yourself with networking on your new Linux distro so that lack of familiarity doesn’t become an obstacle later on.

MakeUseOf

Laravel Migrations: “Table already exists” After Foreign Key Failed


If you create foreign keys in your migrations, there may be a situation where the table is created successfully but the foreign key fails. Then your migration is "half successful", and if you re-run it after fixing the error, it will say "Table already exists". What to do?


The Problem: Explained

First, let me explain the problem in detail. Here’s an example.

Schema::create('teams', function (Blueprint $table) {
    $table->id();
    $table->string('name');
    $table->foreignId('team_league_id')->constrained();
    $table->timestamps();
});

The code looks good, right? Now, what if the referenced table “team_leagues” doesn’t exist? Or maybe it’s called differently? Then you will see this error in the Terminal:

2023_06_05_143926_create_teams_table ..................................................................... 20ms FAIL

Illuminate\Database\QueryException

SQLSTATE[HY000]: General error: 1824 Failed to open the referenced table 'team_leagues'
(Connection: mysql, SQL: alter table `teams` add constraint `teams_team_league_id_foreign` foreign key (`team_league_id`) references `team_leagues` (`id`))

But that is only part of the problem. So ok, you realized that the referenced table is called “leagues” and not “team_leagues”. Possible fix options:

  • Either rename the field of “team_league_id” to just “league_id”
  • Or, specify the table ->constrained('leagues')

But the real problem now is the state of the database:

  • The table teams is already created
  • But the foreign key to leagues has failed!

This means there’s no record of this migration success in the “migrations” Laravel system DB table.

Now, the real problem: if you fix the error in the same migration and just run php artisan migrate, it will say, “Table already exists”.

2023_06_05_143926_create_teams_table ...................................................................... 3ms FAIL

Illuminate\Database\QueryException

SQLSTATE[42S01]: Base table or view already exists:
1050 Table 'teams' already exists
(Connection: mysql, SQL: create table `teams` (...)

So should you create a new migration? Rollback? Let me explain my favorite way of solving this.


Solution: Schema::hasTable() and Separate Foreign Key

You can make the migration safe to re-run by wrapping the table creation in a Schema::hasTable() check, so that the table is only created if it doesn't already exist.

But then, we need to split the foreignId() into parts because it’s actually a 2-in-1 method: it creates the column (which succeeded) and the foreign key (which failed).

So, we rewrite the migration into this:

if (! Schema::hasTable('teams')) {
    Schema::create('teams', function (Blueprint $table) {
        $table->id();
        $table->string('name');
        $table->unsignedBigInteger('team_league_id');
        $table->timestamps();
    });
}

// This may be in the same migration file or in a separate one
Schema::table('teams', function (Blueprint $table) {
    $table->foreign('team_league_id')->references('id')->on('leagues');
});

Now, if you run php artisan migrate, it will execute the complete migration(s) successfully.

Of course, an alternative solution would be to manually delete the teams table via an SQL client and re-run the migration with the fix, but you don't always have access to the database if it's remote. Also, it isn't ideal to perform manual operations on the database if you use migrations. That may be OK on your local database, but the solution above works for any local or remote database.

Laravel News Links

JK Rowling hater demands she performs a sex act on him, but she ruins his life with a tweet instead

https://www.louderwithcrowder.com/media-library/image.png?id=34222424&width=980

JK Rowling has been the head transphobe in charge for quite some time now. You know how leftist sh*tc*nts get when you express a different opinion on them on the internet. They go right to the -isms and -phobias they accuse you of. In Rowling’s case, she also gets people threatening her life. All for the think-crime of, four years ago, defending someone who was fired for saying boys are boys and girls are girls.

If death threats don't faze her, what do you think her putting her mouth on your weiner is going to do? One loser found this out the hard way, and if he has any friends, I hope they're all laughing at him now.

So how did things escalate this quickly? It starts with Maya Forstater, the woman who got fired from her job for 'publishing "offensive" tweets questioning government proposals to allow people to self-identify as the opposite sex.' Defending this woman is what started Rowling on her road to TERFdom. Forstater had just won a settlement after the organisation "was found to have engaged in unlawful discrimination in its decision not to offer her an employment contract or to renew her visiting fellowship."

Turns out that having a common belief about sex and gender does NOT equal bigotry.

Rowling offered Maya her congratulations.

Which led to Joshua D'silva telling Rowling to suck his dick. This WAS the link to the tweet. It was literally deleted while I was working on this post, after Rowling informed the world that D'silva allegedly had a penis so small, it was barely detectable. How embarrassing.

The whole Rowling thing still cracks me up. She is in no way, shape, or form on our side politically AT ALL. When leftists were "resisting" Trump in 2015-16, they would identify themselves as sects from Hogwarts. We would dare them to read another book.

All it took was one tweet and one single opinion for the left to turn on JK Rowling as if she was literally Voldemort. It's hilarious. Especially knowing Rowling is sleeping on a giant pile of money as she responds to each of her haters.

><><><><><><

Brodigan is Grand Poobah of this here website and when he isn’t writing words about things enjoys day drinking, pro-wrestling, and country music. You can find him on the Twitter too.

Facebook doesn’t want you reading this post or any others lately. Their algorithm hides our stories and shenanigans as best it can. The best way to stick it to Zuckerface? Bookmark LouderWithCrowder.com and check us out throughout the day! Also, follow us on Instagram and Twitter.

Louder With Crowder

If The Mandalorian Was The A-Team


The Mandalorian has some great action. So did The A-Team. So Nebulous Bee thought it would be fun to combine the two. This edit reimagines the credits for the Star Wars series in the style of the 1980s hit series, with Bo-Katan, Paz Vizsla, The Armorer, and Din Djarin standing in for Templeton Peck, B.A. Baracus, Hannibal Smith, and Howling Mad Murdock.

The Awesomer