Researchers look a dinosaur in its remarkably preserved face

https://cdn.arstechnica.net/wp-content/uploads/2022/10/DSC_4423B-760×380.jpg


Royal Tyrrell Museum of Palaeontology

Borealopelta mitchelli found its way back into the sunlight in 2017, millions of years after it had died. This armored dinosaur is so magnificently preserved that we can see what it looked like in life. Almost the entire animal—the skin, the armor that coats its skin, the spikes along its side, most of its body and feet, even its face—survived fossilization. It is, according to Dr. Donald Henderson, curator of dinosaurs at the Royal Tyrrell Museum, a one-in-a-billion find.

Beyond its remarkable preservation, this dinosaur is an important key to understanding aspects of Early Cretaceous ecology, and it shows how this species may have lived within its environment. Since its remains were discovered, scientists have studied its anatomy, its armor, and even what it ate in its last days, uncovering new and unexpected insight into an animal that went extinct approximately 100 million years ago.

Down by the sea

Borealopelta is a nodosaur, a type of four-legged ankylosaur with a straight tail rather than a tail club. Its discovery in 2011 in an ancient marine environment was a surprise, as the animal was terrestrial.

A land-based megaherbivore preserved in an ancient seabed is not as uncommon as one might think. A number of other ankylosaurs have been preserved in this manner, albeit not as well as Borealopelta. Scientists suspect its carcass may have been carried from a river to the sea in a flooding event; it may have bobbed at the surface upside-down for a few days before sinking into the ocean depths.

It would have been kept at the surface by what’s referred to as “bloat-and-float,” as the buildup of postmortem gases would keep it buoyant. Modeling done by Henderson indicates its heavy armor would have rolled it onto its back, a position he suspects may have prevented ocean predators from scavenging its carcass.

Once the gases that kept it floating were expelled, Borealopelta sank to the ocean floor, landing on its back.

“We can see it went in water deeper than 50 meters because it was preserved with a particular mineral called glauconite, which is a green phosphate mineral. And it only forms in cooler temperatures in water deeper than 50 meters,” explained Dr. Henderson.

He also told Ars that this environment probably also discouraged scavenging, saying, “It was probably a region where [long-necked] plesiosaurs and big fish didn’t like to go. It was too cold and too dark, and [there was] nothing to eat. And there were very few trace fossils in the sediments around it. So there wasn’t much in the way of worms and crustaceans and bivalves and things in there to further digest it. It was just a nice set of conditions in the seabed that had very low biological activity that led to that preservation.”

Unmet expectations

But none of this was known when the animal was discovered. Although it’s not entirely unusual to find dinosaur remains in marine environments, it’s also not very common. Henderson and Darren Tanke, also from the Royal Tyrrell Museum, walked onto the site fully anticipating that they would excavate an ancient marine reptile.

The two had consulted on fossil discoveries at other open-pit mines within the province. However, this was their first visit to Suncor, a mine in the northeast of Alberta, Canada. Everything about this mine is enormous. Massive machinery is constantly in motion, scooping out rock, sand, and gravel from surrounding cliffs, while other equipment clears it away, all with the goal of uncovering the deeper oil sands for fuel.

“It’s just unbelievable, the scale of the place,” Dr. Henderson said. “And it goes 24 hours a day, 365 days a year.”

Despite the pace of operations, one particular shovel operator, Shawn Funk, happened to notice something after taking a big chunk out of the cliff. It was thanks to him and several people within Suncor that operations stopped in that area and the Royal Tyrrell was notified.

Ars Technica – All content

Where to Buy Windows 10 When Microsoft Stops Selling It

https://i.kinja-img.com/gawker-media/image/upload/c_fill,f_auto,fl_progressive,g_center,h_675,pg_1,q_80,w_1200/4b81222da14575a314da89b206b97c14.jpg

Photo: g0d4ather (Shutterstock)

My friends, it’s been a great run—but Microsoft will stop selling Windows 10 on Tuesday, Jan. 31, one week from this article’s publication. The news isn’t necessarily shocking, since the company has been full-steam-ahead with Windows 11 since October 2021. However, it’s still a sad development. Windows 10 is the preferred OS for many PC users who still can’t stomach upgrading.

Of course, Windows 10 isn’t dead. Microsoft will continue to support both Windows 10 Home and Windows 10 Pro until Oct. 14, 2025, giving plenty of us on PC an excuse to keep running the OS until then. If you already have Windows 10 running on your PC, you’re good to go. But if you’re going to build a PC, you’re going to need a new license to install the beloved OS. Here’s where you can get one.

Buy Windows 10 from Microsoft directly (while you can)

As of this writing, Microsoft is still selling Windows 10 licenses on its website. You can buy Windows 10 Home for $139 and Windows 10 Pro for $199.99. If you want to buy a legitimate copy of Windows 10 before Microsoft’s end-of-month deadline, now’s the time to do it.

Come Feb. 1, though, you won’t have any luck making purchases on Microsoft’s site. So, where can you turn?

Brick and mortar stores

Just because Microsoft is no longer selling Windows 10 doesn’t mean every other store is pulling the plug. Look to established outlets like Best Buy, Staples, or Office Depot for copies of Windows 10. Depending on the store and inventory, you might find a digital download or a physical copy of the software.


Look for old PCs with product key stickers

If you have access to an old PC with a product key sticker on the outside, you can use that code to activate Windows 10 on your current PC. These stickers work all the way back to Windows 7, so it’s a potential solution here.

Be careful of third-party resellers

The first places that pop up when you search for Windows 10 licenses are third-party resellers. These sites have existed for years, and they offer copies of Windows 10 for way less than Microsoft. While the Windows developer charges up to $200 for a copy of Windows 10, sites like Kinguin or PCDestination will sell you a key for anywhere from $25 to $40.

The reason these sites can sell licenses at such a markdown is that they obtained the software for cheap, one way or another. Perhaps the site was able to buy the key in another country where Microsoft charges less for Windows 10. Or perhaps the key is stolen. It’s impossible to know for sure.

You can buy one of these keys and hope for the best, but since you don’t really know whether it’s legitimate, there’s no telling how your PC will respond. You might activate the key and ride out Windows 10 into 2025. Or Microsoft could come around in a year and tell you the license isn’t legit. Or, worse, you might not be able to activate the key at all.

While the prices might be tempting, the safer bet is to pay full price at one of the reputable stores discussed above. Even Amazon’s Windows 10 copies can’t always be trusted. Of course …

You don’t actually need to buy Windows 10 to run Windows 10

If all you want to do is run Windows 10 on your PC without paying a dime, you can totally do that without resorting to piracy. Windows is a bit unique: Microsoft actually lets you download and install the OS on your machine without paying for it first—if you do so from an ISO. You can install the OS from a flash drive or DVD and, once complete, ignore the pop-ups asking you to activate the software.

Microsoft does place some limitations and annoyances on unactivated versions of Windows 10. You’ll have to deal with a watermark on the screen, and you’ll lose the ability to change themes or wallpapers (although you can set your wallpaper by right-clicking on an image). But you’ll be able to install all updates, and, for the most part, use the OS as if you had a license. You can always activate the software with a product key in the future if you like.

You can download the ISO from Microsoft’s site here.

Lifehacker

Introducing the stethoscope package

https://opengraph.githubassets.com/69929117b2e78bddcd3f169c4cd7dc3c3d0b56c5550547501b43ad976c94aa19/MohsenAbrishami/stethoscope

Stethoscope
For listening to your Laravel app server heartbeat

Features | Installation | Usage | Configuration | Testing | Changelog | Contributing | Credits | License


This Laravel package allows you to monitor your server infrastructure.

With this package, you can check your server health at any time.

Features

  • monitor CPU usage percentage

  • monitor memory usage percentage

  • monitor hard disk free space

  • check network connection status

  • check Nginx status

  • record a log entry when CPU, memory, or hard disk consumption exceeds its threshold

  • record a log entry when the network connection fails or Nginx is down

Do you need more options? You can open an issue or contribute to the package.

Get Started

Requirements

  • PHP 8.0+
  • Laravel 8+
  • Linux Operating System (Debian, Ubuntu, Mint, …)

Installation

This package requires PHP 8.0 and Laravel 8.0 or higher.
You can install the package via composer:

composer require mohsenabrishami/stethoscope

and then run:

php artisan vendor:publish --tag=stethoscope

Stethoscope allows you to record reports both in a file and in a database.
If you set the database driver in the config file, you must run the migrate command:
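
php artisan migrate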

Usage

Once installed, see your server health details with a command:

php artisan stethoscope:listen

The output shows the current status of each monitored resource (CPU, memory, hard disk, network, and web server).

But the package can do more than that. You can set thresholds for CPU, memory, and hard disk consumption. If CPU or memory consumption exceeds its threshold, or hard disk free space drops below its threshold, a log entry is created with the consumption details. You can also configure the package to create a log entry if the web server goes down or the internet connection is lost. To start monitoring your server, just run this command:

php artisan stethoscope:monitor

You can monitor your server continuously by running this command from a cron job.
You may want to be notified if there is a problem on the server. For this, it is enough to set the admin email address in the config file.
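
For example, a minimal sketch using Laravel’s task scheduler (the five-minute interval is just an assumption; a plain cron entry that invokes the artisan command works just as well):

    // app/Console/Kernel.php
    protected function schedule(Schedule $schedule)
    {
        // Check server health periodically and log any threshold breaches.
        $schedule->command('stethoscope:monitor')->everyFiveMinutes();
    }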

If you are worried about the increase in logs, use the following command. This command deletes old logs based on the number of days you defined in the config file.

php artisan stethoscope:clean

Configuration

You can easily customize this package in the config/stethoscope.php file.

In this file, you can configure the following:

  • Resources that should be monitored. We can monitor the CPU, memory, hard disk, network connection, and web server status.

  • Web server that is installed on your server. We support Nginx and Apache.

  • Storage driver and path for saving log files.

  • Resource thresholds, including maximum CPU and memory usage and minimum hard disk free space.

  • Custom network URL for network connection monitoring.

  • Driver to save resource logs (file storage and database are supported).

  • Email address to send notification emails to when your server has problems.

  • Number of days for which resource logs must be kept.

By default, the configuration looks like this:

    /*
    |--------------------------------------------------------------------------
    | Monitorable Resources
    |--------------------------------------------------------------------------
    | Here you can define which resources should be monitored.
    | Set true if you want a resource to be monitored, otherwise false.
    |
    */

    'monitorable_resources' => [
        'cpu' => true,
        'memory' => true,
        'hard_disk' => true,
        'network' => true,
        'web_server' => true,
    ],

    /*
    |--------------------------------------------------------------------------
    | Web Server Name
    |--------------------------------------------------------------------------
    | Here you can define which web server is installed on your server.
    | Set `nginx` or `apache`
    |
    */

    'web_server_name' => 'nginx',

    /*
    |--------------------------------------------------------------------------
    | Log File Storage
    |--------------------------------------------------------------------------
    | Define the storage driver and path for saving the log file.
    |
    */

    'log_file_storage' => [
        'driver' => 'local',
        'path' => 'stethoscope/',
    ],

    /*
    |--------------------------------------------------------------------------
    | Thresholds
    |--------------------------------------------------------------------------
    | If resource consumption exceeds these thresholds, a log will be created.
    | You may define maximum CPU and memory usage by percent.
    | You may define minimum hard disk space by byte.
    */

    'thresholds' => [

        'cpu' => env('CPU_MONITOR_THRESHOLD', 90),

        'memory' => env('MEMORY_MONITOR_THRESHOLD', 80),

        'hard_disk' => env('HARD_DISK_MONITOR_THRESHOLD', 5368709),

    ],

    /*
    |--------------------------------------------------------------------------
    | Network Monitor URL
    |--------------------------------------------------------------------------
    | Here you can define the desired URL for network monitoring.
    |
    */

    'network_monitor_url' => env('NETWORK_MONITOR_URL', 'https://www.google.com'),

    /*
    |--------------------------------------------------------------------------
    | Log Record Driver
    |--------------------------------------------------------------------------
    | Set `database` for save logs in database and `file` for record logs in file
    |
    */

    'drivers' => [
        'log_record' => env('STETHOSCOPE_LOG_DRIVER', 'file'),
    ],

    /*
    |
    | You can get notified when specific events occur. you should set an email to get notifications here.
    |
    */

    'notifications' => [
        'mail' => [
            'to' => null,
        ],
    ],

    /*
    |
    | Here you define the number of days for which resource logs must be kept.
    | Older resource logs will be removed.
    |
    */

    'cleanup_resource_logs' => 7

Testing

Run the tests with:

Changelog

Please see CHANGELOG for more information on what has changed recently.

Contributing

Please see CONTRIBUTING for details.

Credits

License

The MIT License (MIT). Please see License File for more information.

Laravel News Links

Crowder sets the record straight on Tim Pool’s podcast

https://www.louderwithcrowder.com/media-library/image.jpg?id=32892647&width=980

With all that has been said about the boss, about this company, and most importantly, ABOUT YOU, Crowder is sitting down with Tim Pool. They’ll be talking about #StopBigCon, the Daily Wire, and ALL of the issues plaguing the conservative media landscape today.


Timcast IRL – Steven Crowder Joins To Discuss StopBigCon.Com Live At 8PM EST

www.youtube.com

Louder With Crowder

10 Things I Look for When I Audit Laravel Codebases


Oftentimes one of the first steps along my journey of working with a new company involves some kind of review of their code in the form of a code audit. How long the process takes depends on how much code they have and what I find. A longer audit should be more thorough, but a shorter audit should still be helpful. I put together this list of 10 things I would typically look for in an initial 2-3 hour audit of a Laravel project.

In this particular scenario I don’t assume that I have any kind of access to any of the company’s environments or the ability to consult with the company’s past or present developers. All I have is the code they’ve given me.

The goal of the audit is not to point fingers or tell people what they’ve done wrong, but to familiarize myself with the way they do things and to better understand the state their codebase is in. Once I have completed this process I can better advise them of potential improvements that could be made. The audit doesn’t necessitate setting up a dev environment. Although that may be helpful, the code should speak for itself. Later, as a follow up, I can set up a dev environment to test my assumptions and determine whether the code actually does what it appears to do.

One of the first things I look at when examining a company’s codebase is whether or not they are using version control. If they send me a copy of their code via email or a file sharing program, I already know that the answer is probably no. Without version control I don’t know who made what changes when, and I don’t have a good way to allow multiple developers to work on the project at the same time. Ideally the client would have their project on GitHub because this is what I am most familiar with, but other solutions like Bitbucket and GitLab would suffice. Version control also opens up the door to continuous integration and delivery, which are both good strategies for an organization to adopt.

As a developer, the README file is where I go first when I want to know about a project on GitHub. It’s here that I look for instructions about installing the project in a local dev environment, what kind of requirements it may have, what the code does at a high level, and maybe some examples of how it is used. If the project has documentation, I would expect to find a link to that documentation here. However, if the default Laravel README is present, that’s not a good sign. It means the developer hasn’t taken the time to document their code in the most obvious and accessible place to do so. It may also mean that I am on my own when it comes to discovering any caveats involved in setting up my own development environment.

The composer.json and composer.lock files contain information about the project’s core dependencies. Once I’ve confirmed that these files exist, I check which version of Laravel the project is using. If I see that it’s using an older version of Laravel, such as version 5.6, which was released in 2018, I might ask why they’ve avoided upgrading. If a major version upgrade is too much, why not at least upgrade to the latest minor version, which would be 5.8?
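
A quick way to check, using nothing more than the code itself, is to look at the framework entry in the Composer files:

grep '"laravel/framework"' composer.json
grep -A 2 '"name": "laravel/framework"' composer.lock

The first shows the version constraint the project declares; the second shows the exact version that was last installed.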

Having the latest version makes it easier to work with other software packages and take advantage of new features which save developers time and improve performance. If the company is not able or willing to make the investment to upgrade to the latest major version, they should at least keep up with the latest minor version which will occasionally eliminate newly-discovered security vulnerabilities and provide minor improvements. They should also be aware of when their version reaches end-of-life and will no longer be officially supported with security fixes.

In the fourth step I look at the routes in the routes directory. I want to see that routes are using middleware, groups, and controllers appropriately. This also gives me a feel for the scope of the project. Here I should be able to see every URL accessible by a user and every API endpoint.
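
As a rough illustration, well-organized routes tend to look something like this (the controller and paths here are hypothetical):

// routes/web.php
use App\Http\Controllers\OrderController;
use Illuminate\Support\Facades\Route;

Route::middleware(['auth'])->group(function () {
    Route::get('/orders', [OrderController::class, 'index']);
    Route::post('/orders', [OrderController::class, 'store']);
});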

I then look for instances where the code interacts with the database to perform a select, insert, update, or delete operation. Ideally most queries and commands would be done through the appropriate model using Eloquent instead of the database facade (DB) and the query builder. The models should contain fillable fields and methods relating them to other models.

I also look for queries that are manually constructed from strings which may introduce opportunities for SQL injection which could compromise the database. I don’t analyze every query individually, but if I see an obvious “N+1” problem (poorly optimized queries that lead to an excessive number of requests to the database) I will make a note of it.
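
For example, with a hypothetical Post model and author relationship, the difference between an N+1 pattern and eager loading looks like this:

// N+1: one query for the posts, then one extra query per post for its author.
$posts = Post::all();
foreach ($posts as $post) {
    echo $post->author->name;
}

// Better: eager load the relationship, adding only one extra query in total.
$posts = Post::with('author')->get();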

For the sixth step in my audit I check to see that they are using migrations. Migrations are a valuable tool included with Laravel and allow the developer to codify schema changes and reliably apply them in production and development environments. Since each migration is a file with a timestamp in the filename it’s easy to see when the first and last migrations were created. If they are using migrations, then I check the tables in the migrations against their models. Are there any models without tables? Are there any tables without models?

I also take this opportunity to examine the structure of individual tables, including column data types. I look for a consistent naming convention for both models and tables – one that takes advantage of Laravel’s automatic model-to-table mapping feature is preferable. If you have, for example, a customer_orders table, the corresponding model would be CustomerOrder.

I like to see that the project provides a way for a developer to seed a new database for development or testing purposes. This can be a valuable resource and provides a much safer way of populating the database than importing production data. Laravel provides ways to seed the database via Database Seeders and Factories. Seeders can also be used when setting up the initial production environment database and can indicate what data is required as a minimum for the app to run.
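
A minimal sketch of the kind of seeding setup I like to find (the model and count are illustrative):

// database/seeders/DatabaseSeeder.php
public function run(): void
{
    // Give every developer the same baseline of fake data.
    User::factory()->count(10)->create();
}

Running php artisan migrate:fresh --seed then rebuilds and repopulates a development database in one step.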

For this step I want to see how the frontend is rendered. This can vary widely between projects. If the project is using Blade templates, I look to see if any templates include logic that they shouldn’t and whether they are using layouts. I’ll scan the package.json file for anything interesting.

I mainly want to know how the frontend is connected to the backend. Is the app rendered server-side or client-side? If they’re using Vue or Inertia, I might spend more time here.

I don’t go into depth here; I just want to see if there are tests and what testing library they are using. I might examine a few of the tests to see what the assertions are and get a feel for the level of coverage.

The last thing I do is look for examples of changes I could make to improve readability and maintainability. I do this by scanning through PHP files where I’m most likely to find business logic and database interactions. I want to find examples of code that is difficult to understand or repetitive. Some things I look for include variable and function names that are ambiguous or misleading, methods that are never called, and snippets of code that appear multiple times as if they were copied and pasted. I may also look for inconsistencies in syntax and use of indentation. From this I can roughly guess which coding standards, if any, the developers adhere to.

After performing this type of audit I should have enough information to take back to the company to advise them on what my next steps would be depending on their goals. The next logical step may involve setting up my own dev environment and testing the core features of the app as a user.

With my environment set up I should be able to validate migrations and run tests. I could also go through my notes from the first audit and test any assumptions I had.

Laravel News Links

This Hall Effect Stick Upgrade Kit Will Solve Joy-Con Drift Forever

https://i.kinja-img.com/gawker-media/image/upload/c_fill,f_auto,fl_progressive,g_center,h_675,pg_1,q_80,w_1200/acb50d93703de9d9a8d20785286af5ba.jpg

GuliKit, makers of the truly excellent KingKong Pro 2 wireless controller for the Nintendo Switch, which we reviewed last year, has just released an upgrade/repair kit for the official Nintendo Joy-Cons that brings its drift-free Hall effect joysticks to the handheld console’s native controllers.

It’s an unfortunate fact that if you own a Nintendo Switch and play it regularly, you’ve possibly already experienced the issue known as ‘Joy-Con drift’ where the Switch detects joystick inputs even when a player’s fingers aren’t touching them at all. The most likely cause is related to components in modern controllers called potentiometers that physically wear down over time with prolonged use.

The issue can make games unplayable, and the only real solutions are to try to convince Nintendo to repair your Joy-Cons or repair them yourself. Even so, both are only a temporary fix when the same problematic joystick hardware is being used as a replacement.

The biggest selling point behind GuliKit’s KingKong Pro 2 wireless controller was that it uses upgraded joysticks that rely on Hall effect sensing, where magnets and magnetic sensors detect even the subtlest movements of the sticks. This eliminates moving parts rubbing against each other and wearing down over time, potentially eliminating joystick drift forever. On certain platforms, if the software is there to support it, you can also use Hall effect sticks to eliminate dead zones or customize how large your stick’s input radius is.

The KingKong Pro 2 was a workaround for Joy-Con drift, however, not a solution. Now, GuliKit has made that controller’s Hall effect joysticks available as a drop-in replacement/upgrade for the joystick hardware that still ships inside the Joy-Cons.


It looks like you can get a pair of them on Amazon for about $30 right now, or a four-pack for $53 if you want to help a friend and save a few bucks in the process. But while GuliKit promises these are a “100% fitable, perfect replacement, drop in with no hassles” fix, swapping out the sticks in your Joy-Cons isn’t as easy as swapping out AA batteries in a radio.

JoyCon Drift Fix! How to Replace the Nintendo Switch Left Joy-Con Joystick

iFixit has shared a video on YouTube of the process of swapping out a Joy-Con’s joystick, and while it’s relatively straightforward, it will definitely help to have the right tools on hand, including tweezers for manipulating ribbon cables, and a special tri-point screwdriver for dealing with the non-standard screws Nintendo loves to use. It goes without saying that an upgrade/repair like this will definitely void your Joy-Cons’ warranties, and probably the Switch’s too, but if you’re suffering from Joy-Con drift with no solution available, this seems like the best way to go right now.

Gizmodo

Steven Crowder and the Daily Wire are publicly beefing about a $50M contract

https://media.notthebee.com/articles/63c9666c7b6d763c9666c7b6d8.jpg

After proclaiming that he was "done being quiet," Steven Crowder went on a tirade on his show yesterday against an unnamed "Big Con" conservative establishment that offered him a contract. As you’ll see below, the Big Con company is the Daily Wire, and the contract was worth $50 million over 4 years.

Not the Bee

Why MySQL Could Be Slow With Large Tables

https://www.percona.com/blog/wp-content/uploads/2023/01/Why-MySQL-Could-Be-Slow-With-Large-Tables-300×168.jpg

Sixteen years ago, our founder Peter Zaitsev covered this topic, and some of the points described there are still valid; we will cover more in this blog. While the technologies have evolved and matured, there are still some people who think that MySQL is only for small projects or that it can’t perform well with large tables.

Startups such as Facebook, Uber, and Pinterest adopted MySQL in their early days, and they are now big, successful companies that prove MySQL can run on large databases and on heavily used sites.

With disks being faster nowadays and CPU and memory resources being cheaper, we can comfortably say that MySQL can handle TBs of data with good performance. For instance, in Percona Managed Services, we have many clients with TBs worth of data whose databases perform well.

In this blog post, we will review key topics to consider for managing large datasets more efficiently in MySQL.

Primary keys:

This is one of the most important things to consider when creating a new table in MySQL: we should always create an explicit primary key (PK). InnoDB will sort the data in primary key order, and that will serve to reference actual data pages on disk. If we don’t specify a primary key, MySQL will check for other unique indexes as candidates for the PK, and if there are none, it will create an internal clustered index to serve as the primary key, which is not optimal.

When there is no application logic or candidate to choose as a primary key, we can use an auto_increment column as the primary key. 

NOTE: As of MySQL 8.0.30, Generated Invisible Primary Keys were introduced to add an invisible primary key when no explicit PK is defined. You can refer to the documentation for further details.

Also, keep in mind that the primary key columns are appended to the end of each secondary index, so try to avoid choosing strings as the primary key, as that makes the secondary indexes larger and hurts performance.
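
For example, a minimal sketch of a table with an explicit, compact primary key (the table and columns are illustrative):

CREATE TABLE orders (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  customer_id INT UNSIGNED NOT NULL,
  created_at DATETIME NOT NULL,
  PRIMARY KEY (id),
  KEY idx_customer_id (customer_id)
) ENGINE=InnoDB;

Here the compact integer PK keeps the clustered index and every secondary index small, compared to using a long string column as the primary key.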

Redundant indexes:

It is known that accessing rows by fetching an index is more efficient than through a table scan in most cases. However, there are cases where the same column is defined on multiple indexes in order to serve different query patterns, and sometimes some of the indexes created for the same column are redundant, leading to more overhead when inserting or deleting data (as indexes are updated) and increased disk space for storing the indexes for the table.

You can use one of our tools, pt-duplicate-key-checker, to detect duplicate keys.

Example (using the employee sample DB):

Suppose we have the following schema:

db1 employees> show create table employees\G
*************************** 1. row ***************************
       Table: employees
Create Table: CREATE TABLE `employees` (
  `emp_no` int NOT NULL,
  `birth_date` date NOT NULL,
  `first_name` varchar(14) NOT NULL,
  `last_name` varchar(16) NOT NULL,
  `gender` enum('M','F') NOT NULL,
  `hire_date` date NOT NULL,
  PRIMARY KEY (`emp_no`),
  KEY `idx_last_name` (`last_name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci

Now, suppose that we need to filter by last_name and hire_date; we would create the following index:

ALTER TABLE employees ADD INDEX idx_last_name_hire_date (last_name,hire_date);

We would end up with the following schema:

db1 employees> show create table employees\G
*************************** 1. row ***************************
       Table: employees
Create Table: CREATE TABLE `employees` (
  `emp_no` int NOT NULL,
  `birth_date` date NOT NULL,
  `first_name` varchar(14) NOT NULL,
  `last_name` varchar(16) NOT NULL,
  `gender` enum('M','F') NOT NULL,
  `hire_date` date NOT NULL,
  PRIMARY KEY (`emp_no`),
  KEY `idx_last_name` (`last_name`),
  KEY `idx_last_name_hire_date` (`last_name`,`hire_date`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci

Now, the index idx_last_name and idx_last_name_hire_date have the same prefix (last_name).

The new index idx_last_name_hire_date can be used to serve queries filtered by last_name only, or by last_name and hire_date, leaving the idx_last_name index redundant.

We can corroborate that by using pt-duplicate-key-checker:

[user1] percona@db1: ~ $ pt-duplicate-key-checker -d employees
# ########################################################################
# employees.employees                                                     
# ########################################################################


# idx_last_name is a left-prefix of idx_last_name_hire_date
# Key definitions:
#   KEY `idx_last_name` (`last_name`),
#   KEY `idx_last_name_hire_date` (`last_name`,`hire_date`)
# Column types:
#   `last_name` varchar(16) not null
#   `hire_date` date not null
# To remove this duplicate index, execute:
ALTER TABLE `employees`.`employees` DROP INDEX `idx_last_name`;


# ########################################################################
# Summary of indexes                                                      
# ########################################################################

# Size Duplicate Indexes   350357634
# Total Duplicate Indexes  1
# Total Indexes            17

Data types:

It’s not uncommon to find databases where the data type is not fitted correctly. There are many cases of int fields whose data could fit in a smallint field, or fixed-size char fields that could be stored in a variable-size varchar field. This may not be a huge problem for small tables, but for tables with millions of records, overprovisioning data types only makes the table bigger in size and less optimal in performance.

Make sure you design the data types correctly while planning for the future growth of the table.

Example:

Creating four simple tables to store strings but using different data types:

db1 test> CREATE TABLE tb1 (id int auto_increment primary key, test_text char(200)); 
Query OK, 0 rows affected (0.11 sec)

db1 test> CREATE TABLE tb2 (id int auto_increment primary key, test_text varchar(200)); 
Query OK, 0 rows affected (0.05 sec)

db1 test> CREATE TABLE tb3 (id int auto_increment primary key, test_text tinytext); 
Query OK, 0 rows affected (0.13 sec)

db1 test> CREATE TABLE tb4 (id int auto_increment primary key, test_text text); 
Query OK, 0 rows affected (0.11 sec)

Inserting 2,000 rows with text:

[user1] percona@db1: ~ $ for i in {1..2000}; do for tb in {1..4}; do mysql test -e "INSERT INTO tb$tb (test_text) VALUES ('Lorem ipsum dolor sit amet, consectetur adipiscing elit. Suspendisse euismod, nulla sit amet rhoncus venenatis, massa dolor lobortis nisi, in.');"; done; done

All four tables have 2,000 rows:

[user1] percona@db1: ~ $ mysql test -e "select count(*) from tb1; select count(*) from tb2; select count(*) from tb3; select count(*) from tb4;"
+----------+
| count(*) |
+----------+
|     2000 |
+----------+
+----------+
| count(*) |
+----------+
|     2000 |
+----------+
+----------+
| count(*) |
+----------+
|     2000 |
+----------+
+----------+
| count(*) |
+----------+
|     2000 |
+----------+

Let’s look at the disk space usage for the tables:

[user1] percona@db1: ~ $ sudo ls -lh /var/lib/mysql/test/|grep tb
-rw-r-----. 1 mysql mysql 592K Dec 30 02:48 tb1.ibd
-rw-r-----. 1 mysql mysql 464K Dec 30 02:48 tb2.ibd
-rw-r-----. 1 mysql mysql 464K Dec 30 02:48 tb3.ibd
-rw-r-----. 1 mysql mysql 464K Dec 30 02:48 tb4.ibd

We can see that tb1 is larger than the others, as it stores the text in a fixed-size char(200) field that always occupies the defined 200 characters regardless of the length of the inserted string, while the varchar, tinytext, and text fields are variable-sized and store only the actual length of the string (in the example we inserted 143 characters).

Compression:

Compression is the process of restructuring the data by changing its encoding in order to store it in fewer bytes. There are many compression tools and algorithms for data out there. 

MySQL supports native compression for InnoDB tables using the Zlib library with the LZ77 compression algorithm. It saves disk space and memory at the expense of CPU usage for compressing and decompressing the data. If CPU usage is not a bottleneck in your setup, you can leverage compression to improve performance: less data needs to be read from disk and written to memory, and indexes are compressed too. It can also help save costs on storage and backup times.

The compression ratio depends on multiple factors, but as with any other compression method, it is more efficient on text than on binaries, so tables with text fields will have a better compression ratio. 

Example (using the employee sample DB):

Created a new table employees_compressed:

mysql> CREATE TABLE employees_compressed LIKE employees;
Query OK, 0 rows affected (0.12 sec)

mysql> ALTER TABLE employees_compressed ROW_FORMAT=COMPRESSED;
Query OK, 0 rows affected (0.14 sec)
Records: 0  Duplicates: 0  Warnings: 0
mysql> INSERT INTO employees_compressed SELECT * FROM employees;

Size comparison:

[user1] percona@db1: ~ $ sudo ls -lh /var/lib/mysql/employees/|grep employees
-rw-r-----. 1 mysql mysql 704M Dec 30 02:28 employees.ibd
-rw-r-----. 1 mysql mysql 392M Dec 30 17:19 employees_compressed.ibd

In this simple example, we had a compression ratio of ~45%!

There are a couple of blog posts from Yves that describe and benchmark MySQL compression:

Compression Options in MySQL (Part 1)
Compression Options in MySQL (Part 2)

Archive or purge old or non-used data:

Some companies have to retain data for multiple years either for compliance or for business requirements. However, there are many cases where data is stored and needed only for a short time; for example, why keep application session information for many years?

While MySQL can handle large data sets, it is always recommended to keep only actively used data in the databases, as this makes data access more efficient and also helps save costs on storage and backups. There is a good blog post from Gaurav, MySQL Data Archival With Minimal Disruption, showing how we can easily archive old data using pt-archiver.
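
As a rough example (the database, table, and retention window are hypothetical), pt-archiver can purge old rows in small batches like this:

pt-archiver --source h=localhost,D=app,t=sessions \
  --where "created_at < NOW() - INTERVAL 90 DAY" \
  --purge --limit 1000 --commit-each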

Partitioning:

Partitioning is a feature that allows dividing a large table into smaller sub-tables based on a partition key. The most common use case for table partitioning is to divide the data by date.

For example, partitioning a table by year can be beneficial if you have data for many years and your query patterns filter by year. In this case, it is more efficient to read only one smaller partition rather than one large table holding information from many years.

It is very important to analyze the partition key before partitioning based on query patterns because if queries do not always use the partition key as a filtering condition, they will need to scan one or multiple partitions to get the desired data, which results in a huge performance penalty. 

It is a cool feature but as mentioned above, it is not suitable for every workload and it needs to be planned carefully, as choosing a poor partition key can result in huge performance penalties.
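
A minimal sketch of partitioning by year (the table and columns are illustrative; note that MySQL requires the partitioning column to be part of every unique key, including the primary key):

CREATE TABLE orders (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  created_at DATE NOT NULL,
  amount DECIMAL(10,2) NOT NULL,
  PRIMARY KEY (id, created_at)
)
PARTITION BY RANGE (YEAR(created_at)) (
  PARTITION p2021 VALUES LESS THAN (2022),
  PARTITION p2022 VALUES LESS THAN (2023),
  PARTITION pmax VALUES LESS THAN MAXVALUE
);

A query that filters on created_at can then be pruned to a single partition, while a query without that filter still has to touch them all.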

Sharding:

Sharding is the concept of splitting data horizontally, i.e., distributing it across multiple servers (shards), so that different portions of a given table may be stored on different servers. This helps split large data sets into smaller ones stored across multiple servers.

The data is split in a similar way to partitioning, using a sharding key, which defines how the data is split and distributed among the shards. This needs to be handled at the application layer, with a coordinator that reads each query and routes it to the specific shard where the data is stored.

Also, it is important to carefully select the appropriate sharding key based on the table’s query patterns, so that the majority of queries can be served by routing to a single shard; having to fetch information from many shards and then filter, process, and aggregate it is an expensive operation.

For these reasons, not all applications or workloads are a good fit for sharding, and since it must be handled by the application, it can add complexity to the environment.

MongoDB supports this natively; MySQL doesn’t, but there are some efforts in the MySQL world to implement sharding.

Some of them are:

  • MySQL Cluster: 

MySQL NDB Cluster is an in-memory database clustering solution developed by Oracle for MySQL. It supports native sharding that is transparent to the application. It is available under a paid subscription.

  • ProxySQL:

It is a feature-rich, open source MySQL proxy solution that allows query routing for the most common MySQL architectures (PXC/Galera, replication, Group Replication, etc.).

It allows sharding by configuring a set of backend servers (shards) and a set of query rules to route the application queries to the specified shards.

Note that it requires some handling in the application, as it does not support merging and retrieving data from multiple shards.

You can find more information in this good blog from Marco: MySQL Sharding with ProxySQL

  • Vitess:

It is an open source database clustering solution, originally developed at YouTube and commercially backed by PlanetScale, that is compatible with the MySQL engine. It supports native sharding. You can find more information about Vitess in our blog post from Alkin: Introduction to Vitess on Kubernetes for MySQL – Part I of III.

MyRocks:

MyRocks is a storage engine developed by Facebook and made open source. It was designed to optimize data storage and access for big data sets. MyRocks ships with Percona Server for MySQL.

There is a cool blog post from Vadim covering big data sets in MyRocks:

MyRocks Use Case: Big Dataset

Query tuning:

It is common to find applications that perform very well at the beginning, but as data grows, performance starts to decrease. The most common cause is that poorly written queries or a poor schema design perform fine with minimal data, but as data grows, all those problems are uncovered. You can make use of the slow_query_log and pt-query-digest to find your problematic queries.
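
A typical starting point (the thresholds and file path are just examples) is to enable the slow query log:

SET GLOBAL slow_query_log = 1;
SET GLOBAL long_query_time = 1;
SET GLOBAL log_queries_not_using_indexes = 1;

and then summarize the collected log from the shell with pt-query-digest:

pt-query-digest /var/lib/mysql/db1-slow.log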

Administration:

Performing administrative tasks on large tables can be painful, specifically schema changes and backups.

For schema changes, Percona has a tool, pt-online-schema-change, that can help us perform schema changes with minimal disruption to the database. It works by creating a new table with the desired schema change applied, and it copies existing data in batches from the original table to the new one. Ongoing changes are copied from the original table to the new one using triggers.

This way, instead of a huge blocking ALTER on a large table, pt-OSC can run in the background, without locking and in batches, to minimize the performance impact.
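
For example (the schema, table, and column are illustrative), adding a column online looks roughly like this; running it with --dry-run first is a good habit:

pt-online-schema-change --alter "ADD COLUMN notes TEXT" D=employees,t=employees --execute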

For backups of large datasets, Percona XtraBackup can help reduce backup and recovery times. It is a hot physical backup solution that copies the data files of the tables while saving the ongoing changes to the database as redo logs. It supports native compression and encryption.
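
A minimal sketch of taking and then preparing a full backup (the target directory is a placeholder; connection credentials are assumed to come from the usual MySQL client configuration):

xtrabackup --backup --target-dir=/backups/full
xtrabackup --prepare --target-dir=/backups/full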

Remember that monitoring your databases is always important to help you find issues or bottlenecks. You can install and use Percona Monitoring and Management for free to take a deeper look at your servers’ and databases’ health. It provides Query Analytics (QAN) to help you find problematic queries in your databases.

Conclusion

The old claim that MySQL can’t handle large datasets is nothing but a myth. With hardware becoming more powerful and cheaper, and the technology evolving, it is now easier than ever to manage large tables in MySQL.

Percona Database Performance Blog

It’s time to stop…

https://www.louderwithcrowder.com/media-library/image.png?id=32860182&width=980

Conservative media giants are no better than Big Tech. The people you thought were fighting for you have been putting quick profits and the appeasement of their tech overlords ahead of any real conservative values. I can’t continue if this does. It’s time we put a stop to the Big Con.


It’s time to stop…

www.youtube.com

Louder With Crowder