Keep your testing database separate from development

In reply to last week’s tip on having Laravel automatically run seeders in your tests, someone asked "How do you set up the test so it doesn’t blow out your real database?"

The answer is to configure your tests to use a different database than you use for local development. But how do you accomplish that?

First, think about how we tell Laravel which database to use at all. We use environment variables like DB_HOST and DB_DATABASE. So what we really want is for DB_DATABASE to have a different value inside our tests than in a normal application request.

Laravel will look for a .env.testing file and use those values during a test run, but I prefer to use phpunit.xml for all testing-related configuration.

By default, Laravel sets some values in phpunit.xml for testing configuration. For example, it sets APP_ENV to testing and drops BCRYPT_ROUNDS to 4 to make our tests a little faster.

We can use this same file to change our DB_DATABASE value. If my development database is called app, then I’d have DB_DATABASE set to app_test in the phpunit.xml.

<php>
    <env name="DB_DATABASE" value="app_test"/>
</php>

Related to this, we’ll also forcibly set any env values used by third party services (AWS, Stripe, etc) to purposely-invalid values. This prevents our tests from ever accidentally hitting a third-party service.

<php>
    <env name="STRIPE_KEY" value="do-not-use"/>
    <env name="STRIPE_SECRET" value="do-not-use"/>
</php>

Here to help,

Joel

P.S. Do you have any questions about testing? Hit reply and ask away!

Laravel News Links

How to process large CSV files with Laravel

Dealing with hefty CSV files is pretty standard in the business world, especially when you’ve got loads of data to analyze, report on, or move around. If you’re using Laravel and need to process large CSV files, you’ve come to the right place. We will guide you through the smoothest way to handle this task without causing a traffic jam in your application’s performance.

Memory and Performance

First off, let’s talk about the elephant in the room: memory and performance. Chugging through a massive CSV can be a memory hog and could slow down your app. Sure, you might think about just cranking up the memory limit or extending the timeout period. But let’s be honest, that’s like putting a band-aid on a leaky pipe – not the best solution.

Enter Simple Excel by Spatie

Instead of the band-aid approach, we’re going to use a nifty package called Simple Excel by Spatie. If you’re nodding because you expected Spatie to have a solution, you’re not alone.

composer require spatie/simple-excel

Assuming you’ve got your CSV file ready to go, we’ll use SimpleExcelReader to load it up. The cool thing is, by default, it returns you a LazyCollection – think of it as a more considerate way to handle your data without exhausting your server’s memory. This means you can process the file bit by bit, keeping your app light on its feet.

The value returned by getRows() is an instance of Illuminate\Support\LazyCollection.

Laravel Jobs to the Rescue

Now, before we dive into code, let’s set up a Laravel Job to manage our CSV processing.

php artisan make:job ImportCsv

Now here is what our ImportCsv job looks like:

<?php

namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
use Spatie\SimpleExcel\SimpleExcelReader;

class ImportCsv implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    /**
     * Create a new job instance.
     */
    public function __construct()
    {
        //
    }

    /**
     * Execute the job.
     */
    public function handle(): void
    {
        SimpleExcelReader::create(storage_path('app/public/products.csv'))
            ->useDelimiter(',')
            ->useHeaders(['ID', 'title', 'description'])
            ->getRows()
            ->chunk(5000)
            ->each(function ($chunk) {
                // Here we have a chunk of 5000 products
            });
    }
}

Here’s the game plan:

  1. Chunking the CSV: We’re going to break that file into manageable pieces, giving us a LazyCollection to play with.
  2. Job Dispatching: For each chunk, we’ll send out a job. This way, we’re processing in batches, which is way easier on your server.
  3. Database Insertion: Each chunk will then be inserted into the database, nice and easy.

Chunking the CSV

With our LazyCollection ready, we’ll slice the CSV into chunks. Think of it like turning a gigantic sandwich into bite-sized pieces – much easier to handle.

php artisan make:job ImportProductChunk

For every piece of the CSV, we’ll create and fire off a job. These jobs are like diligent workers, each taking a chunk and carefully inserting the data into your database.

<?php

namespace App\Jobs;

use App\Models\Product;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldBeUnique;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Database\Eloquent\Model;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
use Illuminate\Support\Str;

class ImportProductChunk implements ShouldBeUnique, ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public $uniqueFor = 3600;

    /**
     * Create a new job instance.
     */
    public function __construct(
        public $chunk
    ) {
        //
    }

    /**
     * Execute the job.
     */
    public function handle(): void
    {
        $this->chunk->each(function (array $row) {
            // Match on product_id only, so a changed title or description updates the row
            Model::withoutTimestamps(fn () => Product::updateOrCreate(
                ['product_id' => $row['ID']],
                [
                    'title' => $row['title'],
                    'description' => $row['description'],
                ]
            ));
        });
    }

    public function uniqueId(): string
    {
        return Str::uuid()->toString();
    }
}

Ensuring Uniqueness

One crucial thing to remember is to use $uniqueFor and uniqueId in your jobs. It’s like giving each worker a unique ID badge, so you don’t accidentally have two people doing the same job – a big no-no for efficiency.

Dispatching Jobs

Back in our ImportCsv job, we’ll dispatch a job for each chunk within the each method. It’s like saying, "You get a chunk, and you get a chunk – everybody gets a chunk!"

<?php

namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
use Spatie\SimpleExcel\SimpleExcelReader;

class ImportCsv implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    /**
     * Create a new job instance.
     */
    public function __construct()
    {
        //
    }

    /**
     * Execute the job.
     */
    public function handle(): void
    {
        SimpleExcelReader::create(storage_path('app/public/products.csv'))
            ->useDelimiter(',')
            ->useHeaders(['ID', 'title', 'description'])
            ->getRows()
            ->chunk(5000)
            ->each(
                fn ($chunk) => ImportProductChunk::dispatch($chunk)
            );
    }
}

And there you have it! Your chunks are off to be processed independently, without any memory drama. If you’re in a rush, just add more workers, and like a well-oiled machine, your data will be processed even quicker.

Processing large CSV files in Laravel doesn’t have to be a headache. With the right tools and approach, you can keep your application running smoothly while dealing with all that data.

Happy coding!


The post How to process large CSV files with Laravel appeared first on Laravel News.

Laravel News

Improve Your Upwork Job Search with RSS and Data Scraping

TLDR

Freelancers who use Upwork have an advantage if they apply to jobs soon after they are posted.

Upwork offers an RSS feed that can be parsed for job information sent in the jobs broadcast.

Feedparser is a Python module that can be used to extract some of the key data from the XML in that RSS feed.

Some of the data in the feed is more deeply embedded and so must be extracted and cleaned before use.

By combining the extracted data into a pandas DataFrame, we can filter the data and save it in a more useful format.

At the end of this article, I’ll provide a Google Colab link for the interactive version of this article.

The Coding Challenge

Upwork is seen to be a good platform for potential freelance jobs.

But there can be some challenges in getting to the jobs quickly enough. Early applications are frequently the ones accepted.

The job search interface is also not very well suited to the filtering and listing of jobs you are looking for.

And…
Freelancers need to actively search the jobs page!

This tutorial and video look at a way to accelerate that search and filter the jobs on preferred criteria.

Learning Objectives

By the end of this tutorial, you will have:

  • Defined the project requirements
  • Explored aspects of Data Scraping
  • Explored RSS feeds and XML
  • Followed a potentially useful and repeatable workflow
  • Built a useful tool
  • Developed some useful Python skills

Approach

As freelancers, it can be helpful to approach every task as a formal project.
It is good practice, and you never know when a project might become something more valuable.

A client may want something similar or it may become a product you can sell.
So, it is a good discipline and will save time in the long run to approach such projects professionally.

A Useful Workflow

This pattern of development has proven helpful for me:

  • Set the project requirements
  • Follow a sound process for data scraping
    • Investigate the data source
    • Acquire the data
    • Extract the data you want
    • Clean the data
    • Filter the data
    • Output the data
    • Use the data and confirm the information is valid
  • Document the project
  • Deliver to the client

And we’ll follow this process now.

Requirements Setting

Just like we do for our clients, we should have specific requirements.

I use the MoSCoW approach to setting my requirements.

This identifies what the project:

  • Must Do
  • Should Do
  • Could Do
  • Won’t Do

And sets out clearly what will be delivered and, equally important, what will not be delivered.

Our requirements:

MUST:

  1. Provide data from Upwork relevant to the Freelancer
  2. Present the information in a readable format

SHOULD:

  1. Allow filtering and manipulation of the data as needed by the user
  2. Allow for rapid refresh

COULD:

  1. Run from the command line with arguments
  2. Be automated

WON’T:

  1. Have a graphical interface

We are limited on time. So we will focus on the data-scraping aspect of the task. And we will only complete the Must and Should requirements.

Getting the Data

Investigation

We can see here on the Upwork ‘search page’ that entering a search term gives you a large number of potential tasks. But we only want some of these, preferably the latest, and we want them filtered to our needs. So we need something different from what is presented here.

This symbol identifies the Really Simple Syndication (RSS) feed, which we shall use for our data.

If we click the link and select RSS, a new page opens with the job feed structured in the Extensible Markup Language (XML).

This is similar to HTML and is a markup language that is readable by the computer and by people (people who can look past the tags and format).

Apparently this dense text is ‘person’ readable!

Some of it seems ok but most is hard to read.

Let’s make this more readable.

Time is money for the Freelancer.
So let’s copy this data and use a web tool, an XML formatter, to explore the RSS XML data.

Here we can see that the XML forms a tree with 10 individual elements, one for each job in this feed.

If we look into the first few items, we can see the information about each job.

Observations

It looks like the feed has 10 elements, and each element has a Title, a Link to the job and a Description. The description appears to be an HTML fragment that contains some of the information that we need.
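The same observation can be checked in code before reaching for a dedicated package. Here is a minimal sketch using only Python's standard library. The two-item feed in SAMPLE_RSS is invented for illustration and is not real Upwork data, and list_items is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-item feed, invented for illustration only
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Sample job feed</title>
    <item>
      <title>Build a scraper</title>
      <link>https://example.com/jobs/1</link>
      <description><![CDATA[Scrape a site.<br /><b>Budget</b>: $100]]></description>
    </item>
    <item>
      <title>Fix a Django bug</title>
      <link>https://example.com/jobs/2</link>
      <description><![CDATA[Fix a bug.<br /><b>Hourly Range</b>: $20.00-$40.00]]></description>
    </item>
  </channel>
</rss>"""


def list_items(rss_text):
    """Return (title, link, description) for every <item> in the feed."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append((
            item.findtext("title", default=""),
            item.findtext("link", default=""),
            item.findtext("description", default=""),
        ))
    return items


for title, link, description in list_items(SAMPLE_RSS):
    print(title, "|", link, "|", len(description), "characters of HTML")
```

Each item yields the same three pieces of data we spotted in the formatter: a title, a link and a description that still contains HTML.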

So let’s scrape that data next!

Acquisition

There are Python packages that we can use to scrape data from such XML feeds.

And it looks like we will be able to extract data from the ‘description’ field too. It appears to be a long string object, and we have Python methods for strings.

For the RSS feed we will use ‘feedparser’.

✅ Note: For simple data-scraping tasks, I like to use a Jupyter notebook. The notebook is useful because it holds the data in memory so it can be explored while we change the code.

This means we don’t need to capture the feed many times.

There is no reason you can’t use VSCode or Pycharm or any other editor.

Again as a freelancer, time is money, so use the tools you are familiar with.

Looking at the feedparser documentation, we can learn that the feed is acquired and parsed with:

import feedparser
data = feedparser.parse(UPWORK_RSS_FEED_URL)

and then we can extract our 3 elements of data using:

import feedparser
data = feedparser.parse(UPWORK_RSS_FEED_URL)

item_title = data.entries[0].title
item_title
item_link = data.entries[0].link
item_link
description  = data.entries[0].description
description

✅ Note: This data will change frequently as the RSS feed is updated with the latest jobs. This is why the RSS feed is so useful to us Finxters.

Extraction

We have already extracted the Title and the Link of the job just from the feedparser entries data.

item_title = data.entries[0].title
item_link = data.entries[0].link

Now we need to extract the individual elements of information from the ‘description‘ string.

Let’s take a closer look at one of the ‘description‘ strings.

description: We are looking for a skilled developer who can create a mobile application and web application for a fitness app. The main feature of the app will be the integration of AI technology to detect the user&#039;s body, diet, and workout plan. The successful candidate will be responsible for designing and developing the app, ensuring it is user-friendly and has a modern, sleek design. The app should be able to track user progress and provide personalized recommendations based on the user&#039;s inputs and body data. Key skills required for this project include: <br /><br />
- Mobile app development <br />
- Web app development <br />
- AI integration <br />
- UX/UI design <br />
- Data analysis and interpretation<br /><br /><b>Hourly Range</b>: $8.00-$10.00
<br /><b>Posted On</b>: December 02, 2023 17:57 UTC<br /><b>Category</b>: Mobile App Development<br /><b>Skills</b>:iOS, Android, Smartphone, Python, Mobile App Development
<br /><b>Skills</b>: iOS, Android, Smartphone, Python, Mobile App Development <br /><b>Country</b>: United States
<br /><a href="https://www.upwork.com/jobs/Fitness-App-Development-with-Functionality_%7E01494dc445d89c9f7f?source=rss">click to apply</a>

Here we see 14 lines of text with HTML markup, including tags such as ‘br’ and character codes such as &#039;

Then we see a selection of headings inside HTML bold tags.

So the general theme for the description block is:

  • description – HTML code of variable length, with some HTML character codes and tags
  • Hourly Range – b_tags and some text
  • Posted On – b_tags and some text
  • Category – b_tags and some text
  • Skills – b_tags and 1 or more skills with commas and spaces in between
  • Skills – a repeated line of skills
  • Country – b_tags and some text
  • Link – a repeat of the link

Knowing this data structure, we can now use Python to extract the information we need.

Let’s write some code!

First we need to import some packages.

  • feedparser for the RSS feed.
  • pandas for our data storage and filtering
  • ssl to bypass SSL certificate checks when fetching the feed.

import feedparser
import pandas as pd
import ssl

Now we need a function to create and return an empty and prepared Dataframe in Pandas.

As we’ve discussed, we need to store each job’s:

  • Title
  • Link
  • Description
  • Posted on
  • Category
  • Skills List
  • Price Type (Hourly Range or Budget)
  • Price (the budget or the maximum hourly rate)
  • Country the job originates in

def make_dataframe():

    jobs_df = pd.DataFrame(columns=[
        'Title', 
        'Link',
        'Description', 
        'Posted',
        'Category',
        'Skills',
        'Price Type', 
        'Price', 
        'Country' ])

    return jobs_df

We need a function that steps through the feed and extracts our information.

Firstly, we set up a list of blank data so that if there are gaps in the information, we still have data to place in the DataFrame. Failure to do this would raise an error.

Then we ‘Parse the Feed’

Title and Job link we can get directly from the feed entry.

But for the ‘description‘, we need to use the ‘string.split‘ method and split the string into a list of elements using the ‘bold‘ tag as the separator.

This gives us:

  • description[0] is the first item in the list and is the main description field. We just need to strip this of HTML tags, and for that we use the ‘clean_string’ function.
  • ‘Posted On’ and ‘Category’ also get cleaned with ‘clean_string’.

Notice that we slice off only the part we need before sending it to be ‘cleaned’, e.g. clean_string(b_tag[15:]).

  • ‘Hourly Range’ / ‘Budget’ get special treatment in the ‘clean_price’ function, which returns a float for the money value and a string for ‘Budget’ or ‘Hourly Range’.
  • ‘Skills’ needs to be split into a list (for searching) and also cleaned.
  • ‘Country’ also needs some special treatment.

Once cleaned, the data is assigned to a dictionary, converted to a one-row DataFrame and later added to the master DataFrame.

def get_data(entry):  # entry is a job item from the RSS feed

    # Some data ends up Null so set those values just in case
    item_posted = ''
    item_cat = ''
    item_price_type = ''
    item_price = 0.0
    item_skills = []
    item_country = ''

    # Set from parsing the feed
    item_title = entry.title
    item_link = entry.link
    description  = entry.description
    description = description.split('<b>')
    item_desc = clean_string(description[0])
    for b_tag in description[1:]:
        if "Hourly Range" in b_tag or "Budget" in b_tag:
            item_price_type, item_price = clean_price(b_tag)
        elif "Posted On" in b_tag:
            item_posted = clean_string(b_tag[15:])
        elif "Category" in b_tag:
            item_cat = clean_string(b_tag[14:])
        elif "Skills" in b_tag and not item_skills :
            item_skills = clean_skills(b_tag[11:])
        elif "Country" in b_tag:
            item_country = clean_country(b_tag[10:])




    # build the DataFrame and return it

    new_job = {
    'Title': item_title, 
    'Link': item_link, 
    'Description': item_desc,
    'Posted': item_posted,
    'Category': item_cat, 
    'Skills': item_skills, 
    'Price Type': item_price_type, 
    'Price': item_price, 
    'Country': item_country}

    new_job_df = pd.DataFrame([new_job])

    return new_job_df
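The fixed offsets in get_data (such as b_tag[15:] and b_tag[14:]) depend on the exact length of each label. As an alternative sketch, not the approach used in this article, the label and value can be separated on the closing bold tag instead. It assumes every field looks like '<b>Label</b>: value', as in the sample description above. The name split_description and the shortened sample string are made up for illustration.

```python
def split_description(description):
    """Split a description into its free text and a dict of labelled fields.

    Assumes each field looks like '<b>Label</b>: value'.
    """
    parts = description.split('<b>')
    body = parts[0]
    fields = {}
    for part in parts[1:]:
        label, sep, value = part.partition('</b>:')
        if not sep:
            continue  # no closing tag, so this is not a labelled field
        # Remove line-break tags and anything from the trailing link onwards
        value = value.replace('<br />', '').split('<a ')[0].strip()
        # Keep the first occurrence only, because the feed repeats 'Skills'
        fields.setdefault(label.strip(), value)
    return body, fields


# Shortened, hypothetical description in the same shape as the feed entry
sample = (
    'Short job text<br /><br /><b>Hourly Range</b>: $8.00-$10.00\n'
    '<br /><b>Posted On</b>: December 02, 2023 17:57 UTC'
    '<br /><b>Category</b>: Mobile App Development'
    '<br /><b>Skills</b>:iOS, Android\n'
    '<br /><b>Skills</b>: iOS, Android <br /><b>Country</b>: United States\n'
    '<br /><a href="https://example.com/job">click to apply</a>'
)

body, fields = split_description(sample)
for label, value in fields.items():
    print(label, '->', value)
```

The trade-off is that every value comes back as a plain string, so the price and the skills list would still need their own cleaning step.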

The ‘clean_string‘ function uses the ‘replace‘ method and takes each substring that isn’t required and either removes it or replaces it with the correct value.

✅ Note: This is not the most Pythonic approach, but it has been written with clarity for beginners in mind. How would you make it more Pythonic?

def clean_string(string):

    string = string.replace('<br />','')
    string = string.replace('</b>','')
    string = string.replace('&nbsp;','')
    string = string.replace('&#039;','\'')
    string = string.replace('&rsquo;','\'')
    string = string.replace('&ldquo;','\"')
    string = string.replace('&rdquo;','\"')
    string = string.replace('&quot;','\"')  # match the full entity, including the '&'
    string = string.strip()

    return string
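One possible answer to the question above, offered as a sketch rather than a drop-in replacement: the standard library's html.unescape decodes every character reference at once, and a small regular expression can drop the tags. The name clean_string_compact is hypothetical. The behaviour also differs slightly from clean_string: curly quotes stay curly, and &nbsp; becomes a normal space instead of being removed.

```python
import html
import re

# Matches anything that looks like an HTML tag, e.g. <br /> or </b>
TAG_PATTERN = re.compile(r'<[^>]+>')


def clean_string_compact(string):
    """Strip HTML tags, then decode HTML character references."""
    string = TAG_PATTERN.sub('', string)   # drops <br />, </b> and any other tag
    string = html.unescape(string)         # &#039; -> ', &quot; -> ", &amp; -> &
    string = string.replace('\xa0', ' ')   # &nbsp; decodes to a non-breaking space
    return string.strip()


print(clean_string_compact('the user&#039;s plan<br /><br />'))
print(clean_string_compact('Scripting &amp; Automation</b>'))
```

Because the pattern removes anything between angle brackets, text that legitimately contains '<' and '>' would be damaged, so it only suits feeds like this one.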

The ‘clean_price‘ function splits the identifier (‘Hourly Range’ or ‘Budget’) into a new string.

It then extracts the number (also a string) and returns it as a float along with the identifier.

def clean_price(item_Bud_HR):

    price_split = item_Bud_HR.split(':')
    item_price_type = clean_string(price_split[0]) 
    item_price = price_split[1] # Get and clean the value
    item_price = item_price.replace('$','')
    item_price = item_price.replace('<br />','')
    item_price = item_price.strip()
    if '-' in item_price:
        item_price = item_price.split('-') # If the price is an 'Hourly Range' we split, returning the number on the right of  '-'  
        item_price = item_price[1]
    item_price = float(item_price)

    return item_price_type, item_price
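A more defensive variant is sketched below. It assumes, without having confirmed it against the live feed, that some prices may carry thousands separators (for example $1,500) or may be missing altogether, both of which would make float() raise an error in clean_price. The name parse_price is hypothetical.

```python
import re

# One number, optionally with thousands separators and a decimal part
NUMBER_PATTERN = re.compile(r'\d[\d,]*(?:\.\d+)?')


def parse_price(b_tag):
    """Return (price_type, price) from text such as 'Budget</b>: $1,500'."""
    label, _, value = b_tag.partition(':')
    price_type = label.replace('</b>', '').strip()
    numbers = NUMBER_PATTERN.findall(value)
    if not numbers:
        return price_type, 0.0  # no number found, fall back to the default
    # For a range such as $8.00-$10.00, the last number is the upper bound
    return price_type, float(numbers[-1].replace(',', ''))


print(parse_price('Hourly Range</b>: $8.00-$10.00\n<br />'))
print(parse_price('Budget</b>: $1,500\n<br />'))
```

Like clean_price, it returns the number on the right of the range, so the existing price filter keeps working unchanged.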

The ‘clean_country‘ function splits the string on '\n'. It then takes the first element, cleans off the white space and returns the Country name.

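def clean_country(item_country):

    # Keep the first line only; the rest of the tag holds the 'click to apply' link
    item_country = item_country.split('\n')
    item_country = clean_string(item_country[0])
    # Drop whatever precedes the colon (the slice leaves 'b>: ' in front of the name)
    item_country = item_country.split(':')[-1].strip()

    return item_country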

The ‘clean_skills‘ function is a little more complex.

We create a new empty list, ‘item_skills_list‘.

We then clean the string by removing HTML tags.

We split the string on the ',' character and step through the list that is created, cleaning each string and then appending it to the list before it is returned.

def clean_skills(item_skills):

    item_skills_list =[]
    item_skills = item_skills.replace('<br />','')
    item_skills = item_skills.split(',')
    for skill in item_skills:
        item_skills_list.append(skill.strip())

    return item_skills_list

Once a new job DataFrame is created for each job, it is ‘concatenated’ to the master DataFrame for later filtering.

def join_dataframes(new_item_df,jobs_df):

    jobs_df = pd.concat([jobs_df, new_item_df], ignore_index=True)

    return jobs_df

The DataFrame jobs_df now holds all of the RSS feed jobs and their associated data. We can now filter it as required.

The filters I have presented here (commented out) offer examples for your own.

  1. Strips out any duplicates based on the ‘Posted’ time.
  2. Looks for budgets and hourly figures above 20.0 dollars
  3. Looks for selected countries (United States and India)

What would you want to filter for?

def filter_output(jobs_df):

    # FILTER THE DATA USING pandas

    # 1. Strip out non-unique values - 'Posted' is the pseudo-primary key
    # Uncomment if needed
    #jobs_df = jobs_df.drop_duplicates(subset=['Posted'], ignore_index=True)

    # 2. Only keep jobs with a price or budget greater than $20
    # Uncomment if needed
    #jobs_df = jobs_df[jobs_df['Price'] > 20]

    # 3. Only keep jobs from specific countries
    # Uncomment below if needed
    #options = ['United States', 'India']
    #jobs_df = jobs_df[jobs_df['Country'].isin(options)]

    return jobs_df
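Since early applications matter, another filter worth considering is the age of the job. The sketch below uses only the standard library and assumes the ‘Posted’ text always follows the format seen in the sample, such as 'December 02, 2023 17:57 UTC'. The names parse_posted and is_recent and the two-hour default are my own choices for illustration.

```python
from datetime import datetime, timedelta, timezone

# Format of the 'Posted On' text as it appears in the sample description
POSTED_FORMAT = '%B %d, %Y %H:%M UTC'


def parse_posted(posted):
    """Convert 'December 02, 2023 17:57 UTC' into a timezone-aware datetime."""
    return datetime.strptime(posted, POSTED_FORMAT).replace(tzinfo=timezone.utc)


def is_recent(posted, now, max_age_hours=2):
    """True when the job was posted within max_age_hours before now."""
    return now - parse_posted(posted) <= timedelta(hours=max_age_hours)


# A fixed 'now' keeps the example repeatable
now = datetime(2023, 12, 2, 19, 0, tzinfo=timezone.utc)
print(is_recent('December 02, 2023 17:57 UTC', now))
print(is_recent('December 02, 2023 12:00 UTC', now))
```

Inside filter_output, something like jobs_df[jobs_df['Posted'].apply(lambda p: is_recent(p, now))] should keep only the recent rows, although rows with an empty ‘Posted’ value would need to be handled first.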

Here we have the main() function that takes the Upwork RSS URL and feeds it to each function in turn.

# MAIN
def main():

    # Why SSL?
    # Python verifies HTTPS certificates by default in the standard library.
    # This bypasses that check for the moment.
    if hasattr(ssl, '_create_unverified_context'):
        ssl._create_default_https_context = ssl._create_unverified_context


    url="https://www.upwork.com/ab/feed/jobs/rss?q=Python&sort=recency&paging=0%3B10&api_params=1&securityToken=6b9f07dc2632b4ac772d5daa37626af471b7d2526826c56a0c16aad6580245646f4e13804c72bd1ed3755f3bd552f5ba1d3f67a021987f714a1ff340ba7659dc&userUid=1215586676591329280&orgUid=1215586676603912193"

    #Get the Feed Data
    data = feedparser.parse(url)

    # Make the master dataframe
    jobs_df = make_dataframe()

    #Get the data for each item and add it to the DataFrame
    for entry in data.entries:
        new_job_df = get_data(entry)
        # join the new dataframe to the list
        jobs_df = join_dataframes(new_job_df, jobs_df)

Now we have the jobs from the RSS feed in a DataFrame, we can filter using the pandas methods.

    jobs_df = filter_output(jobs_df)

Title Link Description Posted Category Skills Price Type Price Country
0 AWS Python Consultant – Upwork https://www.upwork.com/jobs/AWS-Python-Consult… Fluent English speaking Python developer with … December 02, 2023 20:13 UTC DevOps Engineering [Ubuntu, Amazon Web Services, Python, AWS Lamb… Hourly Range 10.0 United Kingdom
1 1min Time Frame Forex Scalper – Upwork https://www.upwork.com/jobs/1min-Time-Frame-Fo… If you scalp the forex market on the m1 time f… December 02, 2023 20:13 UTC Deep Learning [Forex Trading] Hourly Range 40.0 United Kingdom
2 AI – driven crypto charting project – Upwork https://www.upwork.com/jobs/driven-crypto-char… Scope of work\nThedevelopment of a crypto char… December 02, 2023 20:03 UTC Machine Learning [Artificial Intelligence, Machine Learning, Bl… Hourly Range 40.0 Nigeria
3 Gelato Smart Contract Integration Upgrade – Up… https://www.upwork.com/jobs/Gelato-Smart-Contr… I’m looking for a Solidity developer with Foun… December 02, 2023 20:01 UTC Emerging Tech [Solidity, Blockchain, TypeScript, Ethereum] Hourly Range 40.0 United States
4 Publish Open-source AI Agent to Web UI (Flutte… https://www.upwork.com/jobs/Publish-Open-sourc… The goal of this project is to create a web UI… December 02, 2023 20:01 UTC Full Stack Development [AI Agent Development, AI App Development, Flu… Budget 100.0 Canada
5 Price check automation – Upwork https://www.upwork.com/jobs/Price-check-automa… Would like one of the experts to build me a bo… December 02, 2023 19:57 UTC Scripting &amp; Automation [Automation, Data Scraping, Data Mining, Data … Hourly Range 100.0 Saudi Arabia
6 Need for Good Hackers to Assist in Scamming Si… https://www.upwork.com/jobs/Need-for-Good-Hack… We are looking for good hackers who can assist… December 02, 2023 19:38 UTC Information Security [Data Entry, Python] Hourly Range 45.0 United States
7 Microservices Architecture Help – Upwork https://www.upwork.com/jobs/Microservices-Arch… ### **The Data Synchronization Dilemma**\n—\… December 02, 2023 19:35 UTC Back-End Development [Python, Microservice, Software Architecture &… Hourly Range 40.0 India
8 Build two AVL trees for project – Upwork https://www.upwork.com/jobs/Build-two-AVL-tree… I need an avl tree to hold a string node (key)… December 02, 2023 19:28 UTC Full Stack Development [C++] Budget 250.0 United States
9 ROMP texture on 3D SMPL mesh using Pytorch (No… https://www.upwork.com/jobs/ROMP-texture-SMPL-… (WARNING to SCAMMER)\nStarting from an existin… December 02, 2023 19:23 UTC AR/VR Design [Python, PyTorch, Augmented Reality, Linux, Ub… Budget 300.0 Germany

Once we have filtered the data to meet our needs, we use the pandas to_excel method to save the DataFrame to an Excel file.

The code also prints out the top 3 entries to demonstrate the data has been captured.

    print(jobs_df.head(3))
    # Export to excel
    jobs_df.to_excel('jobs.xlsx', index=False)

                                          Title  \
0                AWS Python Consultant - Upwork   
1        1min Time Frame Forex Scalper - Upwork   
2  AI - driven crypto charting project - Upwork   

                                                Link  \
0  https://www.upwork.com/jobs/AWS-Python-Consult...   
1  https://www.upwork.com/jobs/1min-Time-Frame-Fo...   
2  https://www.upwork.com/jobs/driven-crypto-char...   

                                         Description  \
0  Fluent English speaking Python developer with ...   
1  If you scalp the forex market on the m1 time f...   
2  Scope of work\nThedevelopment of a crypto char...   

                        Posted            Category  \
0  December 02, 2023 20:13 UTC  DevOps Engineering   
1  December 02, 2023 20:13 UTC       Deep Learning   
2  December 02, 2023 20:03 UTC    Machine Learning   

                                              Skills    Price Type  Price  \
0  [Ubuntu, Amazon Web Services, Python, AWS Lamb...  Hourly Range   10.0   
1                                    [Forex Trading]  Hourly Range   40.0   
2  [Artificial Intelligence, Machine Learning, Bl...  Hourly Range   40.0   

          Country  
0  United Kingdom  
1  United Kingdom  
2         Nigeria  

That completes our exploration of the code.

I hope you have found some insight and value here.

Let us review.

Learning Objectives

This tutorial and video looked at how to read the RSS feed from Upwork, to accelerate your search and allow you to filter the jobs on your preferred criteria.

We have covered:

  • Defining your project requirements
  • Data Scraping
  • RSS feeds and XML (briefly)
  • A potentially useful workflow
  • The building of a useful tool
  • Some useful Python skills

Next Steps

This code is very flexible and so here are some options you might want to consider if you are extending its utility:

  • You may want to run this code on a timer to give you frequent updates.
  • You may also want to load the previous jobs scraped into the jobs_df DataFrame so that you can append new jobs.
  • You may also want to have a ‘list of urls’ for different searches that you step through in order to cover lots of searches.
  • If your searches are very specific, you might want to have the script email you when a job is posted.
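As a starting point for the timer and append ideas above, here is a small sketch that remembers which job links have already been seen between runs, so that only new jobs are reported. It uses plain dictionaries with a 'Link' key, matching the new_job dictionary built in get_data. The JSON file and the function names are assumptions made for this example.

```python
import json
from pathlib import Path


def load_seen(path):
    """Return the set of job links saved by earlier runs (empty on the first run)."""
    path = Path(path)
    if not path.exists():
        return set()
    return set(json.loads(path.read_text(encoding='utf-8')))


def select_new(jobs, seen):
    """Keep only the jobs whose 'Link' has not been seen before."""
    return [job for job in jobs if job['Link'] not in seen]


def save_seen(path, seen):
    """Write the set of seen links back to disk as a sorted JSON list."""
    Path(path).write_text(json.dumps(sorted(seen)), encoding='utf-8')


# In-memory example with hypothetical links
jobs = [
    {'Title': 'Job A', 'Link': 'https://example.com/jobs/a'},
    {'Title': 'Job B', 'Link': 'https://example.com/jobs/b'},
]
seen = {'https://example.com/jobs/a'}
for job in select_new(jobs, seen):
    print('New job:', job['Title'])
```

On each scheduled run you would call load_seen, pass the fresh feed entries through select_new, add their links to the set and finish with save_seen.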

What will you do?

Resources:

  • https://jsonformatter.org/xml-formatter
  • https://ascii.cl/htmlcodes.htm

You can also check out this guide on Google Colab using this link.

Be on the Right Side of Change

Database status card for Laravel Pulse

Get real-time insights into the status of your database

Installation

Install the package using Composer:

composer require maantje/pulse-database

Register the recorder

In your pulse.php configuration file, register the DatabaseRecorder with the desired settings:

return [
    // ...
    
    'recorders' => [
        \Maantje\Pulse\Database\Recorders\DatabaseRecorder::class => [
            'connections' => [
                'mysql_another' => [
                    'values' => [
                        'Connections',
                        'Threads_connected',
                        'Threads_running',
                        'Innodb_buffer_pool_reads',
                        'Innodb_buffer_pool_read_requests',
                        'Innodb_buffer_pool_pages_total',
                        'Max_used_connections'
                    ],
                    'aggregates' => [
                        'avg' => [
                            'Threads_connected',
                            'Threads_running',
                            'Innodb_buffer_pool_reads',
                            'Innodb_buffer_pool_read_requests',
                            'Innodb_buffer_pool_pages_total',
                        ],
                        'max' => [
                            //
                        ],
                        'count' => [
                            //
                        ],
                    ],
                ],
                'mysql' => [
                    'values' => [
                        'Connections',
                        'Threads_connected',
                        'Threads_running',
                        'Innodb_buffer_pool_reads',
                        'Innodb_buffer_pool_read_requests',
                        'Innodb_buffer_pool_pages_total',
                        'Max_used_connections'
                    ],
                    'aggregates' => [
                        'avg' => [
                            'Threads_connected',
                            'Threads_running',
                            'Innodb_buffer_pool_reads',
                            'Innodb_buffer_pool_read_requests',
                            'Innodb_buffer_pool_pages_total',
                        ],
                        'max' => [
                            //
                        ],
                        'count' => [
                            //
                        ],
                    ],
                ]
            ]
        ],
    ]
];
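
Both connections above record the same status variables, so the repetition can be factored out. The following is a minimal sketch rather than the package's documented setup: the $databaseMetrics variable is a hypothetical name introduced here, and the sketch assumes the recorder only needs the array shape shown above.

```php
<?php

// Hypothetical helper variable: one shared metric definition for every connection.
$databaseMetrics = [
    'values' => [
        'Connections',
        'Threads_connected',
        'Threads_running',
        'Innodb_buffer_pool_reads',
        'Innodb_buffer_pool_read_requests',
        'Innodb_buffer_pool_pages_total',
        'Max_used_connections',
    ],
    'aggregates' => [
        'avg' => [
            'Threads_connected',
            'Threads_running',
            'Innodb_buffer_pool_reads',
            'Innodb_buffer_pool_read_requests',
            'Innodb_buffer_pool_pages_total',
        ],
        'max' => [],
        'count' => [],
    ],
];

return [
    // ...

    'recorders' => [
        \Maantje\Pulse\Database\Recorders\DatabaseRecorder::class => [
            'connections' => [
                // Same connection names as in the example above.
                'mysql_another' => $databaseMetrics,
                'mysql' => $databaseMetrics,
            ],
        ],
    ],
];
```

If one connection later needs different metrics, give it its own array instead of the shared variable.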

Ensure you’re running the pulse:check command (php artisan pulse:check).

Add to your dashboard

Integrate the card into your Pulse dashboard by publishing the vendor view and then modifying the dashboard.blade.php file:

<x-pulse>
    <livewire:pulse.servers cols="full" />
    
+ <livewire:database cols='6' title="Active threads" :values="['Threads_connected', 'Threads_running']" :graphs="[
+ 'avg' => ['Threads_connected' => '#ffffff', 'Threads_running' => '#3c5dff'],
+ ]" />

+ <livewire:database cols='6' title="Connections" :values="['Connections', 'Max_used_connections']" />

+ <livewire:database cols='full' title="Innodb" :values="['Innodb_buffer_pool_reads', 'Innodb_buffer_pool_read_requests', 'Innodb_buffer_pool_pages_total']" :graphs="[
+ 'avg' => ['Innodb_buffer_pool_reads' => '#ffffff', 'Innodb_buffer_pool_read_requests' => '#3c5dff'],
+ ]" />

    <livewire:pulse.usage cols="4" rows="2" />

    <livewire:pulse.queues cols="4" />

    <livewire:pulse.cache cols="4" />

    <livewire:pulse.slow-queries cols="8" />

    <livewire:pulse.exceptions cols="6" />

    <livewire:pulse.slow-requests cols="6" />

    <livewire:pulse.slow-jobs cols="6" />

    <livewire:pulse.slow-outgoing-requests cols="6" />

</x-pulse>

And that’s it! Enjoy enhanced visibility into your database status on your Pulse dashboard.

Laravel News Links

Doom’s creators reminisce about “as close to a perfect game as anything we made”

https://cdn.arstechnica.net/wp-content/uploads/2023/12/enemycloset-760×380.jpg

The archived hour-long chat is a must-watch for any long-time Doom fan.

While Doom can sometimes feel like an overnight smash success, the seminal first-person shooter was far from the first game created by id co-founders John Carmack and John Romero. Now, in a rare joint interview that was livestreamed during last weekend’s 30th-anniversary celebration, the pair waxed philosophical about how Doom struck a perfect balance between technology and simplicity that they hadn’t been able to capture previously and have struggled to recapture since.

Carmack said that Doom-precursor Wolfenstein 3D, for instance, "was done under these extreme, extraordinary design constraints" because of the technology available at the time. "There just wasn’t that much we could do."

Wolfenstein 3D’s grid-based mapping led to a lot of boring rectangular rooms connected by long corridors.

One of the biggest constraints in Wolfenstein 3D was a grid-based mapping system that forced walls to be at 90-degree angles, leading to a lot of large, rectangular rooms connected by long corridors. "Making the levels for the original Wolfenstein had to be the most boring level design job ever because it was so simple," Romero said. "Even [2D platformer Commander Keen] was more rewarding to make levels for."

By the time work started on Doom, Carmack said it was obvious that "the next step in graphics was going to be to get away from block levels." The simple addition of angled walls let Doom hit "this really sweet spot," Carmack said, allowing designers to "create an unlimited number of things" while still making sure that "everybody could draw in this 2D view… Lots of people could make levels in that."

Doom’s support for angled walls and variable heights added a huge amount of design variation while still keeping things relatively simple to edit.

Romero expanded on the idea, saying that working on top of the Doom engine was, at the time, "the easiest way to make something that looks great. If you want to get anything that looks better than this, you’re talking 10 times the work."

Was Quake too complex?

Then came Quake, with a full 3D design that Carmack admitted was "more ambitious" and "did not reach all of its goals." When it came to modding and designing new levels, Carmack lamented how, with Quake, "a lot of potentially great game designers just hit their limit as far as compositional aesthetic in terms of what something is going to look like."

"When you have the ability to do a full six degree-of-freedom modeling, you not only have to be a game designer, you have to be an architect, a modeler working through your composition," Carmack continued. "[Doom] helped you along by keeping you from wasting time doing some crazy things that you would have had to be a master of a different craft to pull off… Going to full 3D made this something that not everybody does on a lark, but something you set out time, almost set a career arc to make mods for newer games."

Quake’s full 3D maps required much more design skill to build good-looking levels.

Romero reminisced during the chat about "the Everest of Quake" and "the insane amount of technology we took on" during its development. "Even adding QuakeC on top of [a] client/server [architecture] on top of full 3D, it was so much tech. It was a whole new engine, it wasn’t Doom at all. It was all brand new."

Looking back, Carmack allowed that "there’s a couple different steps we could have taken [with Quake], and we probably did not pick the optimal direction, but we kept wanting to make this, just throw everything at it and say, ‘If you can think of something that’s going to be better, we should just strive our hardest to do that.’ When Doom came together, it was just a perfect storm of ‘everything went right.’… [it was] as close to a perfect game as anything we made."

Ars Technica – All content

PHP FPM status card for Laravel Pulse

https://opengraph.githubassets.com/2bff59fe8040a87ba2fa7f87c30df23ca5392d7bf7f0411375a4cc4a86b43e15/maantje/pulse-php-fpm

PHP FPM status card for Laravel Pulse

Get real-time insights into the status of your PHP FPM with this convenient card for Laravel Pulse.

Installation

Install the package using Composer:

composer require maantje/pulse-php-fpm

Enable PHP FPM status path

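Configure your PHP FPM status path in your FPM pool configuration using the pm.status_path directive, for example pm.status_path = /status to match the /status path used in the recorder settings below.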

Register the recorder

In your pulse.php configuration file, register the PhpFpmRecorder with the desired settings:

return [
    // ...
    
    'recorders' => [
        PhpFpmRecorder::class => [
            // Optionally set a server name; gethostname() is the default.
            'server_name' => env('PULSE_SERVER_NAME', gethostname()),
            // Optionally set a status path; the value shown here is the default.
            'status_path' => 'localhost:9000/status', // for a unix socket: unix:/var/run/php-fpm/web.sock/status
            // Optionally configure the datasets; these are the default values.
            // Omitting a dataset or setting its value to false removes that line from the chart.
            // You can also set a color as the value, which will be used in the chart.
            'datasets' => [
                'active processes' => '#9333ea',
                'total processes' => 'rgba(147,51,234,0.5)',
                'idle processes' => '#eab308',
                'listen queue' => '#e11d48',
            ],
        ],
    ]
];
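
As a variation on the defaults above, here is a sketch of the same recorder entry pointed at a unix socket, with two chart lines hidden. The socket path is the one from the comment above and is only an assumption about where your pool listens. The sketch also assumes PhpFpmRecorder is imported at the top of pulse.php, as the original snippet implies.

```php
<?php

return [
    // ...

    'recorders' => [
        PhpFpmRecorder::class => [
            // Read the status page over a unix socket instead of TCP.
            'status_path' => 'unix:/var/run/php-fpm/web.sock/status',
            'datasets' => [
                'active processes' => '#9333ea',
                'idle processes' => '#eab308',
                // Setting a dataset to false removes its line from the chart.
                'listen queue' => false,
                // 'total processes' is omitted here, which also removes its line.
            ],
        ],
    ],
];
```

Leaving out server_name keeps the gethostname() default described above.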

Ensure you’re running the pulse:check command (php artisan pulse:check).

Add to your dashboard

Integrate the card into your Pulse dashboard by publishing the vendor view and then modifying the dashboard.blade.php file:

<x-pulse>
    <livewire:pulse.servers cols="full" />
    
+ <livewire:fpm cols="full" />

    <livewire:pulse.usage cols="4" rows="2" />

    <livewire:pulse.queues cols="4" />

    <livewire:pulse.cache cols="4" />

    <livewire:pulse.slow-queries cols="8" />

    <livewire:pulse.exceptions cols="6" />

    <livewire:pulse.slow-requests cols="6" />

    <livewire:pulse.slow-jobs cols="6" />

    <livewire:pulse.slow-outgoing-requests cols="6" />

</x-pulse>

And that’s it! Enjoy enhanced visibility into your PHP FPM status on your Pulse dashboard.

Laravel News Links