Python BeautifulSoup Examples

https://ift.tt/2UWwaPZ

Introduction

In this tutorial, we will work through several examples of using the BeautifulSoup library in Python. To keep the code simple and consistent, every example below follows the same steps:

  1. Inspect the HTML and CSS code behind the website/webpage.
  2. Import the necessary libraries.
  3. Create a User Agent (Optional).
  4. Send get() request and fetch the webpage contents.
  5. Check the Status Code after receiving the response.
  6. Create a Beautiful Soup Object and define the parser.
  7. Implement your logic.
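
Step 5 is commented out in most of the examples below, so a failed request would be parsed as if it were a real page. Here is a minimal sketch of a stricter check, using only the standard library. The helper name describe_status is hypothetical and is not part of requests or BeautifulSoup:

```python
from http import HTTPStatus


def describe_status(status_code):
    """Return (ok, reason) for an HTTP status code.

    ok is True only for 2xx codes; reason is the standard phrase,
    or "Unknown" for codes the standard library does not list.
    """
    try:
        reason = HTTPStatus(status_code).phrase
    except ValueError:
        reason = "Unknown"
    return 200 <= status_code < 300, reason


# Example: decide whether it is worth building the soup at all.
ok, reason = describe_status(404)
if not ok:
    print(f"Request failed: 404 {reason}")
```

With requests, the value to pass in would be response.status_code. Calling response.raise_for_status() is the library's own alternative if you prefer an exception over a return value.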

Disclaimer: This article assumes that you are already familiar with the basic concepts of web scraping. Its sole purpose is to list and demonstrate examples of web scraping, and the examples have been created for educational purposes only. If you want to learn the basic concepts before diving into the examples, please follow the tutorial at this link.

Without further delay let us dive into the examples. Let the games begin!

Example 1: Scraping An Example Webpage

Let’s begin with a simple example in which we extract data from a table in a webpage. The webpage is linked below:

Link to webpage: https://shubhamsayon.github.io/python/demo_html.html

The code to scrape the data from the table in the above webpage has been given below.

# 1. Import the necessary LIBRARIES
import requests
from bs4 import BeautifulSoup

# 2. Create a User Agent (Optional)
headers = {"User-Agent": "Mozilla/5.0 (Linux; U; Android 4.2.2; he-il; NEO-X5-116A Build/JDQ39) AppleWebKit/534.30 ("
                         "KHTML, like Gecko) Version/4.0 Safari/534.30"}

# 3. Send get() Request and fetch the webpage contents
response = requests.get("https://shubhamsayon.github.io/python/demo_html.html", headers=headers)
webpage = response.content

# 4. Check Status Code (Optional)
# print(response.status_code)

# 5. Create a Beautiful Soup Object
soup = BeautifulSoup(webpage, "html.parser")

# 6. Implement the Logic.
for tr in soup.find_all('tr'):
    topic = "TOPIC: "
    url = "URL: "
    values = tr.find_all('td')
    for value in values:
        print(topic, value.text)
        topic = url
    print()

Output:

TOPIC:  __str__ vs __repr__ In Python
URL:  https://blog.finxter.com/python-__str__-vs-__repr__/

TOPIC:  How to Read a File Line-By-Line and Store Into a List?
URL:  https://blog.finxter.com/how-to-read-a-file-line-by-line-and-store-into-a-list/

TOPIC:  How To Convert a String To a List In Python?
URL:  https://blog.finxter.com/how-to-convert-a-string-to-a-list-in-python/

TOPIC:  How To Iterate Through Two Lists In Parallel?
URL:  https://blog.finxter.com/how-to-iterate-through-two-lists-in-parallel/

TOPIC:  Python Scoping Rules – A Simple Illustrated Guide
URL:  https://blog.finxter.com/python-scoping-rules-a-simple-illustrated-guide/

TOPIC:  Flatten A List Of Lists In Python
URL:  https://blog.finxter.com/flatten-a-list-of-lists-in-python/
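
The label-swapping trick in the loop above (topic = url) can also be expressed by pairing labels with cells directly. The following is only a sketch of that alternative: label_cells is a hypothetical helper, and the row is taken from the output above as a stand-in for the cell texts that find_all('td') would give you:

```python
def label_cells(labels, cells):
    """Pair each label with the cell at the same position.

    zip() stops at the shorter input, so a row with fewer cells than
    labels yields fewer lines, and a row with no cells yields none.
    """
    return [f"{label}: {cell}" for label, cell in zip(labels, cells)]


# Stand-in for [td.text for td in tr.find_all('td')] on one table row.
row = [
    "Flatten A List Of Lists In Python",
    "https://blog.finxter.com/flatten-a-list-of-lists-in-python/",
]
for line in label_cells(["TOPIC", "URL"], row):
    print(line)
```

One difference from the original loop: extra cells beyond the number of labels are dropped here, whereas the original would label all of them "URL".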

Example 2: Scraping Data From The Finxter Leaderboard

This example shows how to scrape data from the Finxter leaderboard, which lists each user's Elo rating (points). The data comes from https://app.finxter.com.

The code to scrape the data from the table in the above webpage has been given below.

# import the required libraries
import requests
from bs4 import BeautifulSoup

# create User-Agent (optional)
headers = {"User-Agent": "Mozilla/5.0 (CrKey armv7l 1.5.16041) AppleWebKit/537.36 (KHTML, like Gecko) "
                         "Chrome/31.0.1650.0 Safari/537.36"}

# get() Request
response = requests.get("https://app.finxter.com/learn/computer/science/", headers=headers)

# Store the webpage contents
webpage = response.content

# Check Status Code (Optional)
print(response.status_code)

# Create a BeautifulSoup object out of the webpage content
soup = BeautifulSoup(webpage, "html.parser")
# The logic
for table in soup.find_all('table',class_='w3-table-all',limit=1):
    for tr in table.find_all('tr'):
        name = "USERNAME: "
        elo = "ELO: "
        rank = "RANK: "
        for td in tr.find_all('td'):
            print(name,td.text.strip())
            name = elo
            elo = rank
        print()

Output: Run the code above to view the extracted data.

Example 3: Scraping The Free Python Job Board

Web scraping is handy for automating searches on job websites. The example below is a complete walkthrough of scraping a job board.

Link to website: http://pythonjobs.github.io/

In the code given below, we extract the job title, location, and company name for each listed job. Feel free to run the code on your system and inspect the output.

import requests
from bs4 import BeautifulSoup

# create User-Agent (optional)
headers = {"User-Agent": "Mozilla/5.0 (CrKey armv7l 1.5.16041) AppleWebKit/537.36 (KHTML, like Gecko) "
                         "Chrome/31.0.1650.0 Safari/537.36"}

# get() Request
response = requests.get("http://pythonjobs.github.io/", headers=headers)

# Store the webpage contents
webpage = response.content

# Check Status Code (Optional)
# print(response.status_code)

# Create a BeautifulSoup object out of the webpage content
soup = BeautifulSoup(webpage, "html.parser")

# The logic
for job in soup.find_all('section', class_='job_list'):
    title = [a for a in job.find_all('h1')]
    for n, tag in enumerate(job.find_all('div', class_='job')):
        company_element = [x for x in tag.find_all('span', class_='info')]
        print("Job Title: ", title[n].text.strip())
        print("Location: ", company_element[0].text.strip())
        print("Company: ", company_element[3].text.strip())
        print()

Output:

Job Title:  Software Engineer (Data Operations)
Location:  Sydney, Australia / Remote
Company:  Autumn Compass

Job Title:  Developer / Engineer
Location:  Maryland / DC Metro Area
Company:  National Institutes of Health contracting company.

Job Title:  Senior Backend Developer (Python/Django)
Location:  Vienna, Austria
Company:  Bambus.io

Example 4: Scraping Data From An Online Book Store

Web scraping is widely used to extract product information from shopping websites. In this example, we shall see how to extract data about books from alibris.com.

Link to webpage: https://www.alibris.com/search/books/subject/Fiction

The code given below demonstrates how to extract:

  • The name of each Book,
  • The name of the Author,
  • The price of each book.

# import the required libraries
import requests
from bs4 import BeautifulSoup

# create User-Agent (optional)
headers = {"User-Agent": "Mozilla/5.0 (Linux; U; Android 4.2.2; he-il; NEO-X5-116A Build/JDQ39) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Safari/534.30"}

# get() Request
response = requests.get(
    "https://www.alibris.com/search/books/subject/Fiction", headers=headers)

# Store the webpage contents
webpage = response.content

# Check Status Code (Optional)
# print(response.status_code)

# Create a BeautifulSoup object out of the webpage content
soup = BeautifulSoup(webpage, "html.parser")

# The logic
for parent in soup.find_all('ul', {'class': 'primaryList'}):
    for tag in parent.find_all('li'):
        titles = tag.find_all('p', class_='bookTitle')
        authors = tag.find_all('p', class_='author')
        prices = tag.find_all('a', class_='buy')
        for item in titles:
            print("Book: ", item.text.strip())
        for item in authors:
            # Index 2 assumes the name sits on the third line of the paragraph text
            author_lines = item.text.split("\n")
            print("AUTHOR: ", author_lines[2])
        for item in prices:
            price_text = item.text.strip()
            if 'eBook' in price_text:
                print("eBook PRICE: ", price_text)
            else:
                print("PRICE: ", price_text)
        print()

Output: Run the code above to view the extracted data.
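
Looking up the author by a fixed index into the split text depends on the exact number of line breaks inside the author paragraph. A slightly more forgiving sketch keeps only the non-empty lines. The sample text below is a made-up stand-in for item.text, and the real markup on the site may differ:

```python
def non_empty_lines(text):
    """Split text on newlines and drop blank or whitespace-only lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


# Hypothetical text of a <p class="author"> element (an assumption,
# not copied from the live page).
sample = "\nby\nJane Doe\n"
lines = non_empty_lines(sample)
author_name = lines[-1] if lines else "unknown"
print("AUTHOR: ", author_name)
```

Taking the last non-empty line is itself an assumption about the page layout; inspect the HTML before relying on it.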

Example 5: Scraping Using Relative Links

Until now, we have seen examples where we scraped data directly from a single webpage. Now we will see how to follow hyperlinks and extract data from the pages they point to. In this example, we shall extract data from https://codingbat.com/. Let us try to extract all the questions listed under the Python category on codingbat.com.

The demonstration given below depicts a sample of the data that we are going to extract from the website.

Link to website: https://codingbat.com/python

Solution:

# 1. Import the necessary LIBRARIES
import requests
from bs4 import BeautifulSoup

# 2. Create a User Agent (Optional)
headers = {"User-Agent": "Mozilla/5.0 (Linux; U; Android 4.2.2; he-il; NEO-X5-116A Build/JDQ39) AppleWebKit/534.30 ("
                         "KHTML, like Gecko) Version/4.0 Safari/534.30"}

# 3. Send get() Request and fetch the webpage contents
response = requests.get('http://codingbat.com/python', headers=headers)
webpage = response.content

# 4. Check Status Code (Optional)
# print(response.status_code)

# 5. Create a Beautiful Soup Object
soup = BeautifulSoup(webpage, "html.parser")

# The Logic
url = 'https://codingbat.com'
category_divs = soup.find_all('div', class_='summ')
category_links = [url + div.a['href'] for div in category_divs]
for category_link in category_links:
    # Fetch each category page, reusing the headers defined above
    second_page = requests.get(category_link, headers=headers)
    sub_soup = BeautifulSoup(second_page.content, 'html.parser')
    tabc = sub_soup.find('div', class_='tabc')
    question_links = [url + td.a['href'] for td in tabc.table.find_all('td')]

    for question_link in question_links:
        # Fetch each question page and pull out the problem statement
        third_page = requests.get(question_link, headers=headers)
        third_soup = BeautifulSoup(third_page.content, 'html.parser')
        indent = third_soup.find('div', attrs={'class': 'indent'})
        problem = indent.table.div.string
        # The example calls follow the statement as sibling nodes
        siblings_of_statement = indent.table.div.next_siblings
        demo = [sibling for sibling in siblings_of_statement if sibling.string is not None]
        print(problem)
        for example in demo:
            print(example)

        print("\n")

Output: Run the code above to view the extracted data.
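
The code above builds absolute URLs by concatenating the site root with each href, which works only while every href starts with a slash. The standard library's urljoin handles relative, root-relative, and already-absolute links alike. A small sketch follows; the href values are hypothetical stand-ins for div.a['href']:

```python
from urllib.parse import urljoin

BASE_PAGE = "https://codingbat.com/python"


def absolute_links(base, hrefs):
    """Resolve each href against the page it was found on."""
    return [urljoin(base, href) for href in hrefs]


# Hypothetical href values: one root-relative, one already absolute.
for link in absolute_links(BASE_PAGE, ["/python/Warmup-1", "https://example.com/x"]):
    print(link)
```

The base passed to urljoin should be the URL of the page the link was scraped from, not just the site root, so that links without a leading slash resolve correctly.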

Conclusion

I hope you enjoyed the examples discussed in this article. Please subscribe and stay tuned for more articles and video content in the future!

The post Python BeautifulSoup Examples first appeared on Finxter.

Python

via Finxter https://ift.tt/2HRc2LV

November 24, 2020 at 08:55PM

Raising Baby Yoda

https://ift.tt/3pW7oh3

It might seem like zooming around the galaxy with Baby Yoda is all action and adventure. The Mandalorian’s Din Djarin explains how much work it is to take care of a 50-year-old poop machine in this clip that we’re sure many parents can relate to. A fun and charming fan parody from the guys at The Warp Zone.

fun

via The Awesomer https://theawesomer.com

November 24, 2020 at 06:30PM

Initial Setup with Laravel Breeze

https://ift.tt/3nIOtEB


programming

via Laracasts https://ift.tt/1eZ1zac

November 20, 2020 at 03:19PM

How to Add Stripe One-Time Payment Form to Laravel Project

https://ift.tt/35JYmvE


Payments are one of the most typical elements of any web project, and Stripe is a payment provider that is really easy to install in Laravel projects. In this article, we will add a payment form to a page.

As an example, we will take a Product Show page from our QuickAdminPanel Product Management module, but you can follow the same instructions to add a Stripe form to ANY Laravel project page.

The plan will consist of 8 steps:

  1. Install Laravel Cashier
  2. Run Cashier migrations
  3. Stripe Credentials in .env
  4. User Model should be Billable
  5. Controller: Form Payment Intent
  6. Blade Page: Form, Styles, and Scripts
  7. Controller: Post Payment Processing
  8. After Successful Purchase: Send Product

Let’s begin!


1. Install Laravel Cashier

Run this command:

composer require laravel/cashier

Notice: Currently, the latest version of Cashier is v12. If you’re reading this article when the newer version has arrived, please read its upgrade guide. But personally, I doubt that any fundamentals will change.


2. Run Cashier migrations

The Cashier package registers its own database migration directory, so remember to migrate your database after installing the package:

php artisan migrate

Those migrations are not in the database/migrations folder; they are inside /vendor. Here are their contents.

1. Four new columns to users table:

Schema::table('users', function (Blueprint $table) {
    $table->string('stripe_id')->nullable()->index();
    $table->string('card_brand')->nullable();
    $table->string('card_last_four', 4)->nullable();
    $table->timestamp('trial_ends_at')->nullable();
});

2. New table subscriptions:

Schema::create('subscriptions', function (Blueprint $table) {
    $table->bigIncrements('id');
    $table->unsignedBigInteger('user_id');
    $table->string('name');
    $table->string('stripe_id');
    $table->string('stripe_status');
    $table->string('stripe_plan')->nullable();
    $table->integer('quantity')->nullable();
    $table->timestamp('trial_ends_at')->nullable();
    $table->timestamp('ends_at')->nullable();
    $table->timestamps();

    $table->index(['user_id', 'stripe_status']);
});

3. New table subscription_items:

Schema::create('subscription_items', function (Blueprint $table) {
    $table->bigIncrements('id');
    $table->unsignedBigInteger('subscription_id');
    $table->string('stripe_id')->index();
    $table->string('stripe_plan');
    $table->integer('quantity');
    $table->timestamps();

    $table->unique(['subscription_id', 'stripe_plan']);
});

3. Stripe Credentials in .env

There are two Stripe credentials that you need to add in your .env file:

STRIPE_KEY=pk_test_xxxxxxxxx
STRIPE_SECRET=sk_test_xxxxxxxxx

Where do you get the “key” and “secret”? From your Stripe Dashboard.

Keep in mind that Stripe has two “modes” of keys: testing and live. On your local or testing servers, remember to use the TESTING keys; you can view them by toggling “View Testing Data” in the left menu.

Another way to tell which keys you are using: testing keys start with sk_test_ and pk_test_, while live keys start with sk_live_ and pk_live_. Also, live keys won’t work unless an SSL certificate is enabled.

Notice: if you work in a team, when you add new variables, it’s a very good practice to also add them with empty values in .env.example. Then your teammates will know what variables are needed on their server. Read more here.


4. User Model should be Billable

Simple step: in your User model, add the Billable trait from Cashier:

app/Models/User.php:

// ...
use Laravel\Cashier\Billable;

class User extends Authenticatable
{
    use HasFactory, Billable;

    // ...
}

5. Controller: Form Payment Intent

To enable the Stripe payment form, we need to create a thing called “payment intent” and pass it to the Blade.

In this case, we will add it to ProductController method show():

class ProductController extends Controller
{
    // ...

    public function show(Product $product)
    {
        $intent = auth()->user()->createSetupIntent();

        return view('frontend.coupons.show', compact('product', 'intent'));
    }
}

The createSetupIntent() method comes from the Billable trait that we added to the User model above.


6. Blade Page: Form, Styles, and Scripts

This is the form that we will add from Stripe, with cardholder name, card number, expiry month/year, CVV code, and ZIP code.

Luckily, Stripe documentation tells us exactly what HTML/JavaScript/CSS code should be added.

So, in our show.blade.php, we add this:

<form method="POST" action="{{ route('products.purchase', $product->id) }}" class="card-form mt-3 mb-3">
    @csrf
    <input type="hidden" name="payment_method" class="payment-method">
    <input class="StripeElement mb-3" name="card_holder_name" placeholder="Card holder name" required>
    <div class="col-lg-4 col-md-6">
        <div id="card-element"></div>
    </div>
    <div id="card-errors" role="alert"></div>
    <div class="form-group mt-3">
        <button type="submit" class="btn btn-primary pay">
            Purchase
        </button>
    </div>
</form>

All the inputs are exactly as Stripe suggests; the only thing you need to change is the form's action, which is the route that the form is posted to:

route('products.purchase', $product->id)

We will create that route and Controller method in the next step.

Meanwhile, we also need to include Stripe’s Styles and JavaScript.
Let’s imagine that in your main Blade file, you have @yield sections for styles and scripts, like this:

<!DOCTYPE html>
<html>

<head>
    ...

    @yield('styles')
</head>

<body>
    ...

    
    @yield('scripts')
</body>
</html>

Then, in our show.blade.php, we may fill in those sections, with code from Stripe:

@section('styles')
<style>
    .StripeElement {
        box-sizing: border-box;
        height: 40px;
        padding: 10px 12px;
        border: 1px solid transparent;
        border-radius: 4px;
        background-color: white;
        box-shadow: 0 1px 3px 0 #e6ebf1;
        -webkit-transition: box-shadow 150ms ease;
        transition: box-shadow 150ms ease;
    }
    .StripeElement--focus {
        box-shadow: 0 1px 3px 0 #cfd7df;
    }
    .StripeElement--invalid {
        border-color: #fa755a;
    }
    .StripeElement--webkit-autofill {
        background-color: #fefde5 !important;
    }
</style>
@endsection

@section('scripts')
<script src="https://js.stripe.com/v3/"></script>
<script>
    let stripe = Stripe("{{ env('STRIPE_KEY') }}")
    let elements = stripe.elements()
    let style = {
        base: {
            color: '#32325d',
            fontFamily: '"Helvetica Neue", Helvetica, sans-serif',
            fontSmoothing: 'antialiased',
            fontSize: '16px',
            '::placeholder': {
                color: '#aab7c4'
            }
        },
        invalid: {
            color: '#fa755a',
            iconColor: '#fa755a'
        }
    }
    let card = elements.create('card', {style: style})
    card.mount('#card-element')
    let paymentMethod = null
    $('.card-form').on('submit', function (e) {
        $('button.pay').attr('disabled', true)
        if (paymentMethod) {
            return true
        }
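        stripe.confirmCardSetup(
            "{{ $intent->client_secret }}",
            {
                payment_method: {
                    card: card,
                    // The input has name="card_holder_name" (not a class), so select it by name
                    billing_details: {name: $('input[name="card_holder_name"]').val()}
                }
            }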
        ).then(function (result) {
            if (result.error) {
                $('#card-errors').text(result.error.message)
                $('button.pay').removeAttr('disabled')
            } else {
                paymentMethod = result.setupIntent.payment_method
                $('.payment-method').val(paymentMethod)
                $('.card-form').submit()
            }
        })
        return false
    })
</script>
@endsection

Inside of those sections, we’re adding two variables from the back-end:

env('STRIPE_KEY')

and

$intent->client_secret

So make sure you added them in the previous steps.


7. Controller: Post Payment Processing

Remember the route that we called in the previous step? Time to create it.

In routes/web.php, add this:

Route::post('products/{product}/purchase', 'ProductController@purchase')->name('products.purchase');

And then, let’s create a method in ProductController:

public function purchase(Request $request, Product $product)
{
    $user          = $request->user();
    $paymentMethod = $request->input('payment_method');

    try {
        $user->createOrGetStripeCustomer();
        $user->updateDefaultPaymentMethod($paymentMethod);
        $user->charge($product->price * 100, $paymentMethod);        
    } catch (\Exception $exception) {
        return back()->with('error', $exception->getMessage());
    }

    return back()->with('message', 'Product purchased successfully!');
}

So what is happening here?

1. We get payment_method from the form (Stripe handles it in the background for us).
2. Then we call the Cashier methods to get/create the customer, set their payment method, and charge them.
3. Finally, we redirect back with a success message.
3b. If something goes wrong, the try/catch block handles it and redirects back with an error.

Notice: $product->price is the price of your product, and we multiply it by 100 because Stripe expects the amount in the currency's smallest unit (for example, cents).
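
Multiplying a floating-point price by 100 can land just below the intended integer, which matters once the amount is truncated to whole cents. The sketch below illustrates the pitfall and one way around it. Python is used purely for illustration, and to_cents is a hypothetical helper, not a Cashier or Stripe function:

```python
from decimal import Decimal, ROUND_HALF_UP


def to_cents(price):
    """Convert a decimal price string such as "19.99" to integer cents."""
    cents = (Decimal(price) * 100).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    return int(cents)


print(to_cents("19.99"))   # exact decimal arithmetic
print(int(19.99 * 100))    # binary floating point, then truncation
```

The two printed values differ by one cent. If your prices are stored as decimals or floats, rounding explicitly before charging (or storing prices as integer cents in the first place) avoids that discrepancy.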

To show the success message or errors, in your Blade file, you need to add something like this:

@if(session('message'))
    <div class="alert alert-success" role="alert">{{ session('message') }}</div>
@endif
@if(session('error'))
    <div class="alert alert-danger" role="alert">{{ session('error') }}</div>
@endif

8. After Successful Purchase: Send Product

After the customer has paid for the product, you need to deliver the order. Of course, how you do that depends on what they purchased, and that code is very individual, but I will show you where to put it.

In fact, there are two ways: one easier but less secure, the other harder but more secure.

Option 1. Fulfill Order in ProductController

You can do that directly in the same method:

public function purchase(Request $request, Product $product)
{
    $user          = $request->user();
    $paymentMethod = $request->input('payment_method');

    try {
        $user->createOrGetStripeCustomer();
        $user->updateDefaultPaymentMethod($paymentMethod);
        $user->charge($product->price * 100, $paymentMethod);        
    } catch (\Exception $exception) {
        return back()->with('error', $exception->getMessage());
    }

    // Here, complete the order, like, send a notification email
    $user->notify(new OrderProcessed($product)); 

    return back()->with('message', 'Product purchased successfully!');
}

Easy, right? The problem with this approach is that everything happens within the same request, so $user->charge() may not have actually completed successfully by the time you fulfill the order. In theory, this can lead to orders being delivered for charges that did not succeed.

Option 2. Stripe Webhooks

A more reliable method is to listen for so-called Stripe Webhooks. They confirm that the charge actually succeeded. Whenever something happens in Stripe, it sends a POST request to a URL on your server that you provide in the Stripe dashboard.

You can catch a lot of events from Stripe, and one of those events is charge.succeeded.

For that, I would recommend a package called Laravel Stripe Webhooks; I’ve shot a separate video about it.

So if you want to catch more events, and not only charge success, I advise you to use Stripe Webhooks. Keep in mind that they won’t (easily) work on your local computer; you need to set up a real domain that Stripe can call.
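
Because a webhook endpoint is a public URL, the receiver has to check that a request really came from Stripe. Stripe signs each event with an HMAC-SHA256 over the timestamp and the raw body, and sends the result in a Stripe-Signature header. The sketch below shows the general idea in simplified form. Python is used for illustration only, the function names and the secret are placeholders, and in a Laravel project a webhook package would normally be expected to do this for you:

```python
import hashlib
import hmac
import time


def sign(payload, secret, timestamp):
    """Compute the hex HMAC-SHA256 of "<timestamp>.<payload>"."""
    message = f"{timestamp}.{payload}".encode("utf-8")
    return hmac.new(secret.encode("utf-8"), message, hashlib.sha256).hexdigest()


def verify(payload, header, secret, tolerance=300, now=None):
    """Check a "t=<ts>,v1=<sig>" style header against the raw request body."""
    parts = dict(item.split("=", 1) for item in header.split(",") if "=" in item)
    if "t" not in parts or "v1" not in parts or not parts["t"].isdigit():
        return False
    timestamp = int(parts["t"])
    now = time.time() if now is None else now
    if abs(now - timestamp) > tolerance:
        return False  # outside the allowed time window: possible replay
    expected = sign(payload, secret, timestamp)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), parts["v1"].encode("utf-8"))


# Usage with placeholder values (not a real secret or event).
body = '{"type": "charge.succeeded"}'
secret = "whsec_placeholder"
header = f"t=1000,v1={sign(body, secret, 1000)}"
print(verify(body, header, secret, now=1000))
```

This sketch keeps only one v1 value per header and skips other details of the real scheme, so treat it as an explanation of the mechanism rather than something to deploy.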


That’s it! I wish you many successful payments in your projects.

programming

via Laravel News Links https://ift.tt/2dvygAJ

November 19, 2020 at 08:57PM

The Best Snow Shovel

https://ift.tt/2Kq6M31

We’ve been shoveling ourselves out of average snowstorms and heavy nor’easters with the True Temper 18-Inch Ergonomic Mountain Mover every winter since 2013. It’s the best snow shovel for most people looking to clear walkways, steps, and small driveways. No other shovel matches its unique blend of ideal size, ergonomics, durability, and availability.

technology

via Wirecutter: Reviews for the Real World https://ift.tt/36ATBn1

November 19, 2020 at 03:40PM

Encrypt and Decrypt Data Using Keys

https://ift.tt/3lNceuu

Crypto is a package by Spatie that allows you to easily generate public/private key pairs and then encrypt/decrypt messages using those keys.

The post Encrypt and Decrypt Data Using Keys appeared first on Laravel News.


programming

via Laravel News https://ift.tt/14pzU0d

November 19, 2020 at 09:13AM

Paracord Tutorial: Monkey Fist Stinger Impact Tool

https://ift.tt/3nCbuJq

This is a pretty cool little tutorial on tying/weaving/braiding paracord to create a short length of cord with a handle on one end (complete with clip or key ring) and a paracord-encased steel ball on the other end. This “impact tool” can be called by many names, including monkey’s fist, mini-monkey fist, and stinger — and it looks as if, in the right hands, it could be an effective tool for self defense.

This is a good teaching video from “The Weavers of Eternity Paracord” and I have no doubt I could do this if I wished to… and I just might. I really like how some simple wraps can securely encase a steel sphere… pretty cool.

Ever make one of these, or some other paracord creation(s)? Let us know in the comments section.

The post Paracord Tutorial: Monkey Fist Stinger Impact Tool appeared first on AllOutdoor.com.

guns

via All Outdoor https://ift.tt/2yaNKUu

November 18, 2020 at 01:03PM

OpenAPI-backed API testing in PHP projects – a Laravel example

https://ift.tt/38OHjdL


Last updated: 2020-11-14 :: Published: 11/11/2020

PHP and OpenAPI

Am I proud of this montage? You bet I am.

OpenAPI is a specification intended to describe RESTful APIs in JSON and YAML, with the aim of being understandable by humans and machines alike.

OpenAPI definitions are language-agnostic and can be used in a lot of different ways:

“An OpenAPI definition can be used by documentation generation tools to display the API, code generation tools to generate servers and clients in various programming languages, testing tools, and many other use cases.” (The OpenAPI Specification)

In this article, we will see how to combine OpenAPI 3.0.x definitions with integration tests to validate whether an API behaves the way it’s supposed to, using the OpenAPI HttpFoundation Testing package.

We will do so in a fresh Laravel installation, for which we’ll also generate a Swagger UI documentation using the L5 Swagger package.

I will first elaborate a bit further on why this is useful, but if you’re just here for the code, you’re welcome to skip ahead and go to the Laravel example section straight away.

The issue

APIs are pretty common nowadays, and when we’re lucky they come with some form of documentation that helps us find our way around the endpoints. This documentation comes in many shapes and flavours (some tastier than others), but one thing all of it has in common is that it needs to be updated every time the API it describes changes.

To many developers, maintaining an API’s documentation feels like extra homework when they’ve already passed the exam; it’s boring, sometimes tedious, and often unrewarding. Some strategies can help, like using annotations to keep the code and the documentation in one place; but those are often annoying to write still, and even the most willing developer is not immune to an oversight that won’t necessarily be caught by coworkers.

The usual outcome is that, one way or another, the documentation and the API become out of sync, leading to confused consumers.

Another aspect of API maintenance is ensuring that no endpoint stops functioning the way it’s supposed to; regressions will be introduced eventually, and without a proper testing strategy they might go unnoticed for a while.

A way to avoid this is to implement integration tests that will automatically check that the API’s behaviour is correct, and that recently introduced changes have not had unintended consequences. This is fine, but still doesn’t provide any guarantee that the expectations set in the integration tests are exactly the same as the ones displayed by the documentation.

If only there was a way to ensure that they perfectly reflect each other…

A solution

We are now assuming that we’ve got an API documentation and some integration tests, and we’d like to align their expectations somehow.

The OpenAPI specification has become a popular choice to describe APIs over time, but whether we use it or not doesn’t change the fact that the corresponding definitions need to be maintained; in other words, using OpenAPI does not automagically make the aforementioned issues go away.

What sets OpenAPI apart, however, is that it’s used as the base layer for a growing number of tools that make the specification useful far beyond the mere documenting side of things.

One of these tools built for the PHP ecosystem and maintained by The PHP League is OpenAPI PSR-7 Message Validator, a package for validating HTTP requests and responses implementing the PSR-7 standard against OpenAPI definitions.

The idea is essentially to take HTTP requests and responses, and make sure they match one of the operations described in an OpenAPI definition.

Can you see where this is going?

We could basically use this package to add an extra layer on top of our integration tests, that will take the API responses obtained in the tests and make sure they match the OpenAPI definitions describing our API.
If they don’t, the tests fail.

This is what it looks like as a fancy diagram:

OpenAPI, API and tests relationship

The OpenAPI definition describes the API, and the tests use the OpenAPI definition to make sure the API actually behaves the way the definition says it does.

All of a sudden, our OpenAPI definition becomes a reference for both our code and our tests, thus acting as the API’s single source of truth.
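
To make the idea concrete, here is a deliberately tiny sketch of checking a response against a definition. Python is used for illustration only. The SPEC dictionary, the /api/test path, and the foo field are all hypothetical, and the real packages validate against full OpenAPI schemas rather than a list of required keys:

```python
# A hand-written stand-in for an OpenAPI definition (heavily simplified:
# real definitions use JSON Schema, not bare lists of required keys).
SPEC = {
    ("get", "/api/test"): {
        200: {"required": ["foo"]},
    },
}


def validate_response(spec, method, path, status, body):
    """Return a list of mismatches between a response and the spec."""
    operation = spec.get((method.lower(), path))
    if operation is None:
        return [f"undocumented operation: {method.upper()} {path}"]
    expected = operation.get(status)
    if expected is None:
        return [f"undocumented status code: {status}"]
    return [f"missing field: {key}"
            for key in expected["required"] if key not in body]


# In an integration test, a non-empty list would fail the test.
print(validate_response(SPEC, "GET", "/api/test", 200, {"foo": "bar"}))
print(validate_response(SPEC, "GET", "/api/test", 200, {}))
```

The point of the design is that the test never restates the expected shape of the response itself; it defers to the definition, so the definition and the tests cannot drift apart.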

PSR-7

You might have noticed a small detail in the previous section: the OpenAPI PSR-7 Message Validator package only works for – it’s in the name – PSR-7 messages. The issue here is that not all frameworks support this standard out of the box; as a matter of fact, a lot of them use Symfony’s HttpFoundation component under the hood, whose requests and responses do not implement that standard by default.

The Symfony folks have got us covered though, as they’ve developed a bridge that converts HttpFoundation objects to PSR-7 ones, as long as it’s given PSR-7 and PSR-17 factories to do so, for which they suggest using Tobias Nyholm’s PSR-7 implementation.

All of these pieces form a jigsaw puzzle that the OpenAPI HttpFoundation Testing package offers to assemble for us, allowing developers to back their integration tests with OpenAPI definitions in projects leveraging the HttpFoundation component.
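
To make the jigsaw a bit more concrete, here is a rough sketch of the kind of assembly involved, assuming the symfony/psr-http-message-bridge, nyholm/psr7 and league/openapi-psr7-validator packages are installed. The function name and its parameters are made up for illustration, and this is not the package's actual code, only an approximation of the steps it takes care of.

```php
<?php

use League\OpenAPIValidation\PSR7\OperationAddress;
use League\OpenAPIValidation\PSR7\ValidatorBuilder;
use Nyholm\Psr7\Factory\Psr17Factory;
use Symfony\Bridge\PsrHttpMessage\Factory\PsrHttpFactory;
use Symfony\Component\HttpFoundation\Response;

/**
 * Hypothetical helper: validates an HttpFoundation response against the
 * operation identified by $path and $method in the given YAML definition.
 */
function validateHttpFoundationResponse(Response $response, string $yamlFile, string $path, string $method): void
{
    // Nyholm's Psr17Factory implements all the PSR-17 factory interfaces the bridge asks for.
    $psr17Factory = new Psr17Factory();
    $psrHttpFactory = new PsrHttpFactory($psr17Factory, $psr17Factory, $psr17Factory, $psr17Factory);

    // Convert the HttpFoundation response to a PSR-7 one.
    $psrResponse = $psrHttpFactory->createResponse($response);

    // Build a response validator from the OpenAPI definition.
    $validator = (new ValidatorBuilder())->fromYamlFile($yamlFile)->getResponseValidator();

    // Throws an exception if the response does not match the described operation.
    $validator->validate(new OperationAddress($path, $method), $psrResponse);
}
```

The OpenAPI HttpFoundation Testing package hides this plumbing behind a single validator object, as the Laravel example shows.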

Let’s see how to use it in a Laravel project, which falls into this category.

A Laravel example

The code contained in this section is also available as a GitHub repository.

First, let’s create a new Laravel 8 project, using Composer:

$ composer create-project --prefer-dist laravel/laravel openapi-example "8.*"

Enter the project’s root folder and install a couple of dependencies:

$ cd openapi-example
$ composer require --dev osteel/openapi-httpfoundation-testing
$ composer require darkaonline/l5-swagger

The first one is the OpenAPI HttpFoundation Testing package mentioned earlier, which we install as a development dependency since it’s intended to be used as part of our test suite.

The second one is L5 Swagger, a popular package bringing Swagger PHP and Swagger UI to Laravel. We actually don’t need Swagger PHP here, as it uses Doctrine annotations to generate OpenAPI definitions and we’re going to write our own manually instead. We do need Swagger UI, however, and the package conveniently adapts it to work with Laravel.

To make sure Swagger PHP doesn’t overwrite the OpenAPI definition, let’s set the following environment variable in the .env file at the root of the project:

L5_SWAGGER_GENERATE_ALWAYS=false

Create a file named api-docs.yaml in the storage/api-docs folder (which you need to create), and add the following content to it:

openapi: 3.0.3

info:
  title: OpenAPI HttpFoundation Testing Laravel Example
  version: 1.0.0

servers:
  - url: http://localhost:8000/api

paths:
  '/test':
    get:
      responses:
        '200':
          description: Ok
          content:
            application/json:
              schema:
                type: object
                required:
                  - foo
                properties:
                  foo:
                    type: string
                    example: bar

This is a simple OpenAPI definition describing a single operation – a GET request on the /api/test endpoint that should return a JSON object containing a required foo key.

Let’s check whether Swagger UI displays our OpenAPI definition correctly. Start PHP’s development server with this artisan command, to be run from the project’s root:

$ php artisan serve

Open localhost:8000/api/documentation in your browser and replace api-docs.json with api-docs.yaml in the navigation bar at the top (this is so Swagger UI loads up the YAML definition instead of the JSON one, as we haven’t provided the latter).

Hit the enter key or click Explore – our OpenAPI definition should now be rendered as a Swagger UI documentation:

[Screenshot: Swagger UI]

Expand the /test endpoint and try it out – it should fail with a 404 Not Found error, because we haven’t implemented it yet.

Let’s fix that now. Open the routes/api.php file and replace the example route with this one:

Route::get('/test', function (Request $request) {
    return response()->json(['foo' => 'bar']);
});

Go back to the Swagger UI tab and try the endpoint again – it should now return a successful response.

Time to write a test! Open tests/Feature/ExampleTest.php and replace its content with this one:

<?php

namespace Tests\Feature;

use Osteel\OpenApi\Testing\ResponseValidatorBuilder;
use Tests\TestCase;

class ExampleTest extends TestCase
{
    /**
     * A basic test example.
     *
     * @return void
     */
    public function testBasicTest()
    {
        $response = $this->get('/api/test');

        $validator = ResponseValidatorBuilder::fromYaml(storage_path('api-docs/api-docs.yaml'))->getValidator();

        $result = $validator->validate('/test', 'get', $response->baseResponse);

        $this->assertTrue($result);
    }
}

Let’s unpack this a bit. For those unfamiliar with Laravel, $this->get() is a test method provided by the MakesHttpRequests trait that performs a GET request on the provided endpoint, executing the request’s lifecycle without leaving the application. It returns a response identical to the one we would obtain if we performed the same request from the outside.

We then create a validator using the Osteel\OpenApi\Testing\ResponseValidatorBuilder class, to which we feed the YAML definition we wrote earlier via the fromYaml static method (the storage_path function is a helper returning the path to the storage folder, where we stored the definition).

Had we had a JSON definition instead, we could have used the fromJson method; also, both methods accept YAML and JSON strings respectively, as well as files.
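
As a small illustration of those alternatives, the sketch below builds validators from a YAML string and from a JSON file. The two helper functions are hypothetical names introduced here, and the JSON file does not exist in this example project since we only wrote a YAML definition.

```php
<?php

use Osteel\OpenApi\Testing\ResponseValidatorBuilder;

/**
 * Hypothetical helper: builds a validator from the YAML content itself
 * rather than from a file path.
 */
function validatorFromYamlString(string $yamlFile)
{
    // Read the definition ourselves and pass the resulting string to the builder.
    $yaml = file_get_contents($yamlFile);

    return ResponseValidatorBuilder::fromYaml($yaml)->getValidator();
}

/**
 * Hypothetical helper: builds a validator from a JSON definition file.
 */
function validatorFromJsonFile(string $jsonFile)
{
    return ResponseValidatorBuilder::fromJson($jsonFile)->getValidator();
}
```

In the test above, calling validatorFromYamlString(storage_path('api-docs/api-docs.yaml')) should give a validator equivalent to the one built directly from the file path.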

The builder returns an instance of Osteel\OpenApi\Testing\ResponseValidator, on which we call the validate method, passing the path, the HTTP method and the response as parameters ($response is an Illuminate\Testing\TestResponse object here, a wrapper for the underlying HttpFoundation object, which can be retrieved through the baseResponse public property).

The above is basically the equivalent of saying I want to validate that this response conforms to the OpenAPI definition of a GET request on the /test path.

It could also be written this way:

$result = $validator->get('/test', $response->baseResponse);

That’s because the validator has a shortcut method for each of the HTTP methods supported by OpenAPI (GET, POST, PUT, PATCH, DELETE, HEAD, OPTIONS and TRACE), to make it simpler to test responses for the corresponding operations.

Note that the specified path must exactly match one of the OpenAPI definition’s paths.

You can now run the test, which should be successful:

$ ./vendor/bin/phpunit tests/Feature

Open routes/api.php again, and replace the route with this one:

Route::get('/test', function (Request $request) {
    return response()->json(['baz' => 'bar']);
});

Run the test again; it should now fail, because the response contains baz instead of foo, and the OpenAPI definition says the latter is expected.

Our test is officially backed by OpenAPI!

The above is obviously an oversimplified example for the sake of the demonstration, but in a real situation a good practice would be to override the MakesHttpRequests trait’s call method, so it performs both the test request and the OpenAPI validation.

As a result, our test would now be a single line:

$this->get('/api/test');

This could be implemented as a new MakesOpenApiRequests trait that would “extend” the MakesHttpRequests one, and that would first call the parent call method to get the response. It would then work out the path from the URI, and validate the response against the OpenAPI definition before returning it, for the calling test to perform any further assertions as needed.
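
Here is a minimal sketch of what such a trait might look like. It is one possible take, not a definitive implementation: rather than importing MakesHttpRequests, it relies on parent::call(), which resolves to the trait's method through Laravel's base TestCase. The resolveOpenApiPath helper, the hard-coded definition path and the /api prefix handling are assumptions made for this example.

```php
<?php

namespace Tests;

use Osteel\OpenApi\Testing\ResponseValidatorBuilder;

/**
 * Sketch only. Meant to be used in Tests\TestCase, whose parent class
 * (Laravel's base TestCase) already uses the MakesHttpRequests trait.
 */
trait MakesOpenApiRequests
{
    /**
     * Perform the request through the parent implementation, then validate
     * the response against the OpenAPI definition before returning it.
     */
    public function call($method, $uri, $parameters = [], $cookies = [], $files = [], $server = [], $content = null)
    {
        // Resolves to MakesHttpRequests::call() via the base TestCase.
        $response = parent::call($method, $uri, $parameters, $cookies, $files, $server, $content);

        $validator = ResponseValidatorBuilder::fromYaml(storage_path('api-docs/api-docs.yaml'))->getValidator();

        $result = $validator->validate($this->resolveOpenApiPath($uri), strtolower($method), $response->baseResponse);

        $this->assertTrue($result);

        return $response;
    }

    /**
     * Hypothetical helper: work out the OpenAPI path from the request URI by
     * dropping the query string and the /api prefix (assumed here because the
     * definition's server URL already ends with /api).
     */
    protected function resolveOpenApiPath(string $uri): string
    {
        $path = parse_url($uri, PHP_URL_PATH) ?: '/';

        return preg_replace('#^/api(?=/|$)#', '', $path) ?: '/';
    }
}
```

With the trait added to tests/TestCase.php next to CreatesApplication, every request made through $this->get(), $this->post() and the like would be validated, since those methods all go through call. Building the validator once and reusing it across requests would be a sensible refinement.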

Conclusion

While the above setup is a great step up in improving an API’s robustness, it is no silver bullet; it requires that every single endpoint is covered with integration tests, which is not easily enforceable in an automated way, and ultimately still requires some discipline and vigilance from the developers. It may even feel a bit coercive at first, since as a result they are basically forced to maintain the documentation in order to write successful tests.

The added value, however, is that said documentation is now guaranteed to be accurate, leading to happy consumers who will now feel the joy of using an API that just works; this, in turn, should lead to less frustrated developers, who shall spend less time hunting down pesky discrepancies.

All in all, making OpenAPI definitions the single source of truth for both the API documentation and the integration tests is in itself a strong incentive to keep them up to date; they naturally become a priority, where they used to be an afterthought.

As for maintaining the OpenAPI definition itself, doing so manually can admittedly feel a bit daunting. Annotations are a solution, but I personally don’t like them and prefer to maintain a YAML file directly. IDE extensions like this VSCode one make it much easier, but if you can’t bear the sight of a YAML or JSON file, you can also use tools like Stoplight Studio to do it through a more user-friendly interface (note: I am not affiliated).

And since we’re talking about Stoplight, this article about API Design-First vs Code First by Phil Sturgeon is a good starting point for API documentation in general, and might help you choose an approach to documenting that suits you.


Last updated by osteel on the

via Laravel News Links https://ift.tt/2dvygAJ

November 17, 2020 at 08:54PM