Invention network company Xinova to shut down, 5 years after spinning out of Intellectual Ventures

https://cdn.geekwire.com/wp-content/uploads/2021/08/1516157988441.jpeg

Xinova founder Edward Jung. (LinkedIn Photo)

Xinova, a Seattle company that operated a network of inventors, is winding down operations, GeekWire has learned.

Xinova spun out of Intellectual Ventures in 2016 and helped match inventor ideas with customers such as PepsiCo, Honda, Funai, and others. It billed itself as an “innovation-as-a-service” platform.

The company raised a $48 million Series A investment to scale the business and grow its network, which included more than 12,000 inventors across 118 countries. Xinova would find companies that needed tech solutions; put a call out to its inventor network; and compensate inventors via upfront payments and profit-sharing agreements. Xinova would typically cover development costs and manage intellectual property. At one point the company employed more than 100 people across ten offices worldwide.

But Xinova needed more cash and ran into trouble when it tried raising another $100 million. The U.S.-China trade war and the pandemic caused investor wariness, and one firm even backed out of a signed funding commitment.

The company had to trim expenses, laying off staff over the past two years and cutting costs elsewhere. It again tried to raise more capital earlier this year but wasn’t able to convince investors. Xinova tried to work with its creditors on a restructuring plan but that also didn’t pan out, forcing the shutdown decision. The company now has only two employees who are helping wind down operations.

Edward Jung, who helped launch invention business Intellectual Ventures in 2000 and was an early Microsoft engineering leader, was Xinova’s founder and CEO. He said the company was able to run an “economically-sustainable business” but couldn’t renegotiate its legacy liabilities, which made additional equity fundraising impossible.

Jung resigned several months ago, citing a conflict with his position as the largest secured creditor. He said he is responsible for “taking on liabilities and risks in excess to what was sustainable through these unexpected changes.”

“I regret disappointing my most excellent investors, partners, and employees,” Jung added. “But I remain a believer in the vision of global innovation networks and am continuing this vision in new stealth projects.”

Jung launched Intellectual Ventures with former Microsoft technology chief Nathan Myhrvold. In 2007, Jung began heading up the firm’s “Invention Development Fund,” which was spun out in 2016 as Xinova.

Jorma Ollila, former chairman of Royal Dutch Shell and former chairman and CEO of Nokia, was Xinova’s executive chairman. Paul Levins, Xinova co-founder and former chief strategy officer, left in 2019 and is currently helping restaurant tech startup Souszen and agtech startup Beanstalk.

Xinova spun out its own standalone company called Allied Inventors in 2017 to help manage intellectual property assets. We’ve reached out to Allied to learn if it is still in business. Tom Kang, who previously led Xinova, is CEO of Allied, according to the company’s website.

Xinova in 2019 launched a joint venture called Arcnet, an online capital marketplace that lets investors fund innovation projects on one platform across borders.

Xinova Asia will continue to operate. Xinova also had offices in Japan, Finland, Tel Aviv, Vienna, Beijing, Singapore, Sydney, Hong Kong, and Bangalore.

GeekWire

7 Algorithms Every Programmer Should Know

https://static1.makeuseofimages.com/wordpress/wp-content/uploads/2021/08/dijkstra-algo.jpg

As a student of programming, you’ve likely learned plenty of different algorithms throughout the course of your career. Becoming proficient in different algorithms is absolutely essential for any programmer.

With so many algorithms, it can be challenging to keep track of what’s essential. If you’re prepping for an interview or simply brushing up on your skills, this list will make it relatively easy. Read on as we list the most essential algorithms for programmers.

1. Dijkstra’s Algorithm

Edsger Dijkstra was one of the most influential computer scientists of his time, and he contributed to many different areas of computing science, including operating systems, compiler construction, and much more. One of Dijkstra’s most notable contributions is his ingenious shortest path algorithm for graphs, also known as Dijkstra’s Shortest Path Algorithm.

Dijkstra’s algorithm finds the shortest paths from a single source vertex to every other vertex in a graph. On each iteration, the algorithm picks the unvisited vertex with the minimum known distance from the source and adds it to the set of finalized vertices. This greedy choice is the core of Dijkstra’s algorithm.


The algorithm is typically implemented using a set of finalized vertices. Dijkstra’s algorithm is very efficient when implemented with a min-heap; you can compute all shortest paths in O((V + E) log V) time, where V is the number of vertices and E is the number of edges in the graph.

Dijkstra’s algorithm has its limitations: it only works on directed or undirected graphs with non-negative edge weights. For graphs with negative weights, the Bellman-Ford algorithm is typically preferable.

Interview questions commonly include Dijkstra’s algorithm, so we highly recommend understanding its details and applications.
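
To make this concrete, here is a minimal Python sketch of Dijkstra’s algorithm using the standard library’s heapq module as the min-heap. The adjacency-list format (a dict mapping each vertex to (neighbour, weight) pairs) and the function name are illustrative assumptions, not a fixed API:

import heapq

def dijkstra(graph, source):
    """Return the shortest distance from source to every reachable vertex.

    graph: dict mapping vertex -> list of (neighbour, weight) pairs,
    with non-negative weights (an assumed input format).
    """
    dist = {source: 0}
    heap = [(0, source)]  # (distance, vertex), ordered by distance
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float('inf')):
            continue  # stale entry; a shorter path to u was already found
        for v, w in graph.get(u, []):
            new_dist = d + w
            if new_dist < dist.get(v, float('inf')):
                dist[v] = new_dist
                heapq.heappush(heap, (new_dist, v))
    return dist

graph = {'a': [('b', 1), ('c', 4)], 'b': [('c', 2)], 'c': []}
print(dijkstra(graph, 'a'))  # {'a': 0, 'b': 1, 'c': 3}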

2. Merge Sort

We’ve got a couple of sorting algorithms on this list, and merge sort is one of the most important. It’s an efficient sorting algorithm based on the Divide and Conquer technique. In the worst case, merge sort sorts n numbers in just O(n log n) time. Compared to primitive sorting techniques such as Bubble Sort (which takes O(n^2) time), merge sort is far more efficient.

Related: Introduction to Merge Sort Algorithm

In merge sort, the array to be sorted is repeatedly broken down into subarrays until each subarray consists of a single number. The recursive algorithm then repeatedly merges the sorted subarrays back together until the entire array is sorted.
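
As a quick illustration, here’s a simple Python sketch of merge sort. For clarity it returns a new sorted list instead of sorting in place:

def merge_sort(values):
    """Sort a list in O(n log n) time by splitting and merging."""
    if len(values) <= 1:
        return list(values)
    mid = len(values) // 2
    left = merge_sort(values[:mid])
    right = merge_sort(values[mid:])

    # Merge the two sorted halves back together.
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

print(merge_sort([5, 2, 4, 1, 3]))  # [1, 2, 3, 4, 5]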

3. Quicksort

Quicksort is another sorting algorithm based on the Divide and Conquer programming technique. In this algorithm, an element is first chosen as the pivot, and the entire array is then partitioned around this pivot.

As you’ve probably guessed, a good pivot is crucial for an efficient sort. The pivot can be a random element, the median element, the first element, or even the last element.

Implementations of quicksort often differ in the way they choose a pivot. In the average case, quicksort sorts a large array with a good pivot in just O(n log n) time.

Quicksort repeatedly partitions the array around the pivot, moving the pivot into its final sorted position within the subarray, with elements smaller than the pivot on its left and elements greater than the pivot on its right.
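
Here’s a short Python sketch of that idea using a random pivot. It builds new lists for readability; a production quicksort would normally partition the array in place:

import random

def quicksort(values):
    """Return a sorted copy; average-case O(n log n) with a random pivot."""
    if len(values) <= 1:
        return list(values)
    pivot = random.choice(values)
    smaller = [x for x in values if x < pivot]
    equal = [x for x in values if x == pivot]
    larger = [x for x in values if x > pivot]
    # Elements smaller than the pivot end up on its left, larger on its right.
    return quicksort(smaller) + equal + quicksort(larger)

print(quicksort([9, 3, 7, 1, 3]))  # [1, 3, 3, 7, 9]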

4. Depth First Search

Depth First Search (DFS) is one of the first graph algorithms taught to students. DFS is an efficient algorithm used to traverse or search a graph. It can also be modified to be used in tree traversal.


The DFS traversal can start from any arbitrary node, and it dives into each adjacent vertex. The algorithm backtracks when there is no unvisited adjacent vertex, or when it reaches a dead end. DFS is typically implemented with a stack and a boolean array to keep track of the visited nodes. It is simple to implement and exceptionally efficient; it runs in O(V + E) time, where V is the number of vertices and E is the number of edges.

Typical applications of the DFS traversal include topological sort, detecting cycles in a graph, pathfinding, and finding strongly connected components.
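
A minimal iterative Python sketch of DFS, using an explicit stack and a visited set; the dict-of-lists graph format is an assumption for illustration:

def dfs(graph, start):
    """Return vertices in depth-first visit order, starting from start."""
    visited, order = set(), []
    stack = [start]
    while stack:
        vertex = stack.pop()
        if vertex in visited:
            continue
        visited.add(vertex)
        order.append(vertex)
        # Push neighbours so the traversal dives into them before backtracking.
        for neighbour in reversed(graph.get(vertex, [])):
            if neighbour not in visited:
                stack.append(neighbour)
    return order

graph = {'a': ['b', 'c'], 'b': ['d'], 'c': [], 'd': []}
print(dfs(graph, 'a'))  # ['a', 'b', 'd', 'c']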

5. Breadth-First Search

Breadth-First Search (BFS) is also known as level order traversal for trees. BFS runs in O(V + E) time, just like DFS. However, BFS uses a queue instead of a stack: DFS dives deep into the graph, whereas BFS traverses it breadthwise, level by level.

The BFS algorithm uses a queue to keep track of the vertices. A vertex is removed from the front of the queue and explored; its unvisited adjacent vertices are marked as visited and added to the back of the queue. The process repeats until the queue is empty.

BFS is commonly used in peer-to-peer networks, for finding shortest paths in unweighted graphs, and for finding minimum spanning trees.
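
Here’s the matching Python sketch for BFS; it’s the same traversal skeleton, but with a deque used as the queue, so vertices are explored level by level:

from collections import deque

def bfs(graph, start):
    """Return vertices in breadth-first visit order, starting from start."""
    visited = {start}
    order = []
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        order.append(vertex)
        for neighbour in graph.get(vertex, []):
            if neighbour not in visited:
                visited.add(neighbour)  # mark when enqueued, not when dequeued
                queue.append(neighbour)
    return order

graph = {'a': ['b', 'c'], 'b': ['d'], 'c': ['d'], 'd': []}
print(bfs(graph, 'a'))  # ['a', 'b', 'c', 'd']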

6. Binary Search

Binary Search is a simple algorithm for finding a required element in a sorted array. It works by repeatedly dividing the search range in half. If the required element is smaller than the middle element, the left half is searched further; otherwise, the right half is searched. The process is repeated until the required element is found or the search range becomes empty.

The worst-case time complexity of binary search is O(log n), which makes it very efficient for searching sorted arrays.
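
A small Python sketch of iterative binary search over a sorted list, returning the index of the target or -1 if it isn’t present:

def binary_search(sorted_values, target):
    """Return the index of target in sorted_values, or -1 if absent."""
    low, high = 0, len(sorted_values) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_values[mid] == target:
            return mid
        if sorted_values[mid] < target:
            low = mid + 1   # keep searching the right half
        else:
            high = mid - 1  # keep searching the left half
    return -1

print(binary_search([1, 3, 5, 7, 9], 7))  # 3
print(binary_search([1, 3, 5, 7, 9], 4))  # -1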

7. Minimum Spanning Tree Algorithms

A minimum spanning tree (MST) of a graph has the minimum cost among all possible spanning trees. The cost of a spanning tree depends on the weight of its edges. It’s important to note that there can be more than one minimum spanning tree. There are two main MST algorithms, namely Kruskal’s and Prim’s.


Kruskal’s algorithm creates the MST by adding the edge with minimum cost to a growing set. The algorithm first sorts edges by their weight and then adds edges to the MST starting from the minimum.

It’s important to note that the algorithm doesn’t add edges that form a cycle. Kruskal’s algorithm is preferred for sparse graphs.

Prim’s Algorithm also uses a greedy property and is ideal for dense graphs. The central idea in Prim’s MST is to have two distinct sets of vertices; one set contains the growing MST, whereas the other contains unused vertices. On each iteration, the minimum weight edge that will connect the two sets is selected.

Minimum spanning tree algorithms are essential for cluster analysis, taxonomy, and broadcast networks.
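
As an illustration, here’s a compact Python sketch of Kruskal’s algorithm. It uses a simple union-find structure to skip edges that would form a cycle; the (weight, u, v) edge format over vertices numbered 0..n-1 is an assumption for the example:

def kruskal(num_vertices, edges):
    """Return the edges of a minimum spanning tree (Kruskal's algorithm)."""
    parent = list(range(num_vertices))

    def find(x):
        # Find the root of x's set, with path halving for efficiency.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for weight, u, v in sorted(edges):  # consider edges from cheapest to priciest
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # only add edges that connect two different components
            parent[root_u] = root_v
            mst.append((weight, u, v))
    return mst

edges = [(1, 0, 1), (4, 0, 2), (2, 1, 2), (7, 2, 3), (3, 1, 3)]
print(kruskal(4, edges))  # [(1, 0, 1), (2, 1, 2), (3, 1, 3)]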

An Efficient Programmer Is Proficient With Algorithms

Programmers constantly learn and develop their skills, and there are some core essentials that everyone needs to be proficient in. A skilled programmer knows the different algorithms, the benefits and drawbacks of each, and which algorithm would be most appropriate for a given scenario.

MUO – Feed

Spider-Man: No Way Home (Teaser Trailer)

https://theawesomer.com/photos/2021/08/spider_man_far_from_home_teaser_t.jpg

Spider-Man: No Way Home (Teaser Trailer)

Link

After Mysterio reveals Spider-Man’s true identity, Peter Parker’s life is in chaos. So he asks his friend Doctor Strange to make everyone forget his secret identity. But Peter’s indecisiveness causes Doc’s spell to go wrong, opening a rift in space-time. No Way Home looks like a thrilling gateway to the Multiverse of Madness.

The Awesomer

Laravel-Medialibrary – Associate files with Eloquent models

Spatie has introduced an excellent package called Laravel Media Library. This package can associate all kinds of files with Eloquent models. It offers a straightforward and fluent API for working with media. The storage of the uploaded files is managed by Laravel’s Filesystem, so you can easily store large files on another filesystem. The Pro version of the package offers Blade, Vue, and React components to manage uploads to the media library and to handle the contents of a media library collection.

Requirements

  • Laravel Media Library requires PHP 7.4+ and Laravel 7+.
  • This package uses JSON columns.
  • MySQL 5.7 or higher is required.

Installation & Setup

You can install the Media Library via Composer. To use the base package, run this command:

composer require "spatie/laravel-medialibrary:^9.0.0"

If you have a license for Media Library Pro, you can install laravel-medialibrary-pro by running this command:

composer require spatie/laravel-medialibrary-pro

Preparing the database

After installation, you need to publish the migration to generate the media table:

php artisan vendor:publish --provider="Spatie\MediaLibrary\MediaLibraryServiceProvider" --tag="migrations"

After that, you need to run this command.

php artisan migrate

Publishing the config file

Publishing the config file is optional. If you want to publish it, run this command:

php artisan vendor:publish --provider="Spatie\MediaLibrary\MediaLibraryServiceProvider" --tag="config"

Adding a media disk

By default, the media library will store its files on Laravel’s public disk. If you want to use a dedicated disk, you should add one to config/filesystems.php with a configuration like this:

    ...
    'disks' => [
        ...

        'media' => [
            'driver' => 'local',
            'root'   => public_path('media'),
            'url'    => env('APP_URL').'/media',
        ],
    ...

Don’t forget to .gitignore the directory of your configured disk, so the files won’t end up in your git repo.

To store all media on that disk by default, you need to set the disk_name config value in the media-library config file to the name of the disk that you’ve added.

// config/media-library.php

return [
    'disk_name' => 'media',

    // ...
];

Code Examples!

Here are some quick code examples:

$product = Product::find(1);
$product->addMedia($pathToFile)->toMediaCollection('images');

It can also directly handle your uploads:

$product->addMediaFromRequest('image')->toMediaCollection('images');

If you want to store some large files on another filesystem, you can pass the disk name as the second argument:

$product->addMedia($smallFile)->toMediaCollection('downloads', 'local'); 
$product->addMedia($bigFile)->toMediaCollection('downloads', 's3');

The storage of the files is managed by Laravel’s Filesystem, so you can plug in any compatible filesystem. This package can also generate derived images such as thumbnails for images, videos, and PDFs. Once you’ve set up your model, they’re easily accessible:

$product->getMedia('images')->first()->getUrl('thumb');

If you want to dig deeper, you can visit the package’s comprehensive documentation and examples on GitHub.

Laravel News Links

Comic for August 22, 2021

https://assets.amuniversal.com/107b71f0c72101396b35005056a9545d


Dilbert Daily Strip

17 Ways to Get Your Website Ready to Win

https://ashallendesign.ams3.digitaloceanspaces.com/public/blog/30/17-ways-to-get-your-website-to-win.png

Introduction

Your website is a really important tool! It can help you to grow your business if it’s built and maintained correctly.

In this article, I’ve put together 17 different things that you can do to get your website ready to win! You might already know about some of these points, but there might be one that you haven’t thought of before.

1. Make Sure Your Pages Load Quickly

A key to driving traffic to your website is ranking higher in the search engines. Google uses page loading speed as a ranking factor. So, by making your website load faster, you’ll be able to give yourself a nice boost up the rankings.

Having a fast-loading page is also really important for your actual visitors. A page that loads slowly can discourage your visitors from staying and may cause them to leave.

A useful tool for analysing your page load times is Google PageSpeed Insights.

For any of my Laravel developer readers, you might find my article about “6 Quick & Easy Ways to Speed Up Your Laravel Website” useful.

2. Make Sure Your Website Is Using HTTPS

Your website should be using HTTPS rather than HTTP. By using it, it means that you are allowing your visitors to connect over an encrypted connection.

It’s really important to make sure that you’re using HTTPS because Google penalises any websites not using it.

As well as this, it also gives your visitors confidence to put their payment details into your website.

3. Carry Out an SEO Audit Now

The more visitors you get to your website, the more likely you are to make a sale! So, carry out an SEO (search engine optimisation) audit on your site. Use this to identify aspects of your website that you can improve to rank higher on Google.

You can use free tools such as SEOptimer and Neil Patel’s SEO Analyzer to highlight any obvious places that you can improve on. But, if you’re looking at getting an in-depth audit of your website’s SEO, I’d recommend getting in touch with an agency or freelancer who specialises in SEO.

4. Add ‘Live Chat’ or a ‘Chat Bot’

We live in a world where we can get instant answers to our questions. Why should your website be any different?

Don’t make your users sit around waiting for an email response. Give them immediate answers with a chat bot or live chat. By giving your visitors a quick way of contacting you, you can reduce the chances that they’ll go to another website to find their answer.

There are several different chat systems that you can add to your site such as: Zendesk, Tawk.to, Zoho, and Facebook Messenger.

5. Make Sure Your Website is Responsive

This point cannot be emphasised enough! Almost 60% of all internet access is done using mobile phones. So we can probably assume that a large majority of your visitors are going to be on their smartphones. Of course, if you have analytics set up on your website, you can get more accurate figures on the types of browsers and devices that your visitors are using.

So, it’s important that the site looks great and works well on mobile devices, tablets and desktops.

6. Add Your Website To All Of Your Social Media Profiles

A large proportion of your target market are going to frequently use social media sites and apps. So it’s important to make sure that you have a page for your business on all of the platforms. This will help people find you!

Make sure you also add a link back to your website where possible so that you can direct traffic there to make a sale.

It’s also important to make sure that you don’t spread yourself too thin and try to manage too many social media accounts at once. If you’re needing help with your social media accounts, I’d highly recommend getting in touch with a social media marketing agency who will be able to help you out.

7. Reduce Clutter in Your Pages and Navigation

Website visitors can easily get confused if there’s too much clutter. Giving a user 20 different links in your navigation will just overwhelm them.

Try to aim for no more than 7-9 navigation links. The fewer choices a user is given, the quicker they can make their decision. And the quicker a user can make a decision, the more likely they are to make a purchase. Hesitation and delay kill sales!

8. Give Your Website a Face Lift

This doesn’t apply for every website. But it’s likely that some of you reading this won’t have updated your website for the past year or two.

It’s important that you keep your website looking modern and fresh. If your website looks neglected, dated and stale, what does that say about you and your company? This doesn’t necessarily mean that you need a full website redesign every couple of years, but making minor changes to your website’s styling to keep up with current trends can work wonders.

9. Show Off Your Testimonials and Past Projects

As humans, we rely on “social proof”. We like knowing that someone has bought something before us and would recommend the product/service. That’s why we love checking for 5-star ratings before buying anything online.

Show off your testimonials online! This lets your potential customers know that they’re making the right decision choosing you and not your competitor.

If you don’t have any testimonials so far, try reaching out to your existing clients/customers and asking them if they’d be able to give you a sentence or two as a review. Depending on the type of website that you’re running, it might even be helpful to get some reviews on Google, Trustpilot, Facebook and LinkedIn.

10. Special Offers/Discounts for New Customers

Getting new customers and clients is much harder than keeping recurring ones. You’ve got to build the initial trust and convince them to buy from you.

Offering a first-time buyer discount or special offer is a great way to get new customers on board. Next time they’re looking for a product/service like yours, they’re more likely to come to you rather than a competitor.

11. Invest in Better Photography and Images

A large number of websites use stock images and photography. This has a knock-on effect and makes your website look very “cookie-cutter”.

Why not invest a little and get bespoke images and graphics for your website that show off you or your company? Bespoke imagery can help your website stick in your visitors’ heads for longer and make it feel unique.

12. Start a Blog

If you haven’t already started a blog, then start one. This is a great way to drive traffic to your website. It’s also a way of you showing your readers/potential customers that you are knowledgeable in your field.

You can also share your blog posts across your social media profiles to drive even more traffic to your site.

13. Start a Podcast

Podcasts are growing like crazy at the moment! The way that people are consuming content now is changing.

So, why not start a podcast for your potential customers to listen to?

To get started quickly, why not provide all of your blog posts in an audio format for people to listen to?

14. Set Up Google Analytics

It’s important that you know what your website visitors are looking at when on your website.

By using Google Analytics, you can identify the user-journey of your visitors and see which pages they’re viewing when they’re on your site. As a result of doing this, you can update your navigation or your page layout to influence your visitors’ actions if you’d prefer them to do something else.

15. Invest in a Copywriter

Writing content for your website can be harder than you’d think (or at least it was for me). You have to make sure that you’re writing the content for your website with SEO in mind so that your pages rank well in the search results. But you also need to make sure that the content is also going to convince your visitors to complete your call to action (for example, buying a product).

I originally wrote the content for my site and I was never too happy with it. So, I got in touch with a copywriter and he rewrote the content for every page. As a result of doing this, I saw a clear improvement in my Google rankings and the amount of people contacting me for web development services.

So, I’d definitely recommend getting in touch with a copywriter who can write the copy for your website for you.

16. Make Sure You Have a Clear Call to Action

What do you want your website visitors to do? Buy a product? Register? Book an appointment? Contact you?

You need to make it clear to your visitors that you want them to carry out this action. This is what you’d call your “call to action” (sometimes seen as CTA).

For example, with my website in particular, I have a button in my navigation bar saying “Contact” that is a different colour to the other links. My main goal of any visitors who visit my site looking for web development services is for them to get in contact with me. I do this because I can then talk to any leads and find out what they need from me as a developer.

Your call to action is likely going to be different to mine though. So, I think it’s important for you to look at your website and ask yourself two questions:

  1. What do I want my visitors to do on my website?
  2. Is it clear and obvious how a visitor could do this?

17. Grow Your Email List

Growing your email list can be a really great way to help your website grow. You can use it to let your visitors know about new promotions, or to notify your users about things like new blog posts or podcast episodes.

So, by using your email list, you can drive traffic back to your website. But one of the best parts of it is that the visitors coming from your emails will already have built some form of trust with you (because they’d already provided their email address to you). As a result of this, there’s a higher probability that you’ll be able to convert your visitors to paying customers or clients.

Conclusion

Hopefully this article has given you an idea or two that you can use to improve your website.

If you’ve enjoyed reading this post and found it useful, feel free to sign up for my newsletter below so that you can get notified each time I publish a new article.

Keep on building awesome stuff! 🚀

Laravel News Links

death and gravity: Write an SQL query builder in 150 lines of Python!

https://death.andgravity.com/_file/query-builder-how/join-operator.svg

Previously

This is the fourth article in a series about
writing an SQL query builder for my feed reader library.

Today, we’ll dive into the code by rewriting it from scratch.

Think of it as part walk-through, part tutorial;
along the way, we’ll talk about:

  • API design
  • knowing when to be lazy
  • worse (and better) ways of doing things

As you read this, keep in mind that it is linear by necessity.
Development was not linear:
I tried things out, I changed my mind multiple times,
and I rewrote everything at least once.
Even now, there may be equally-good – or better – implementations;
the current one is simply good enough.

What are we trying to build? #

We want a way to build SQL strings that takes care of formatting:

>>> query = Query()
>>> query.SELECT('one').FROM('table')
<builder.Query object at 0x7fc953e60640>
>>> print(query)
SELECT
 one
FROM
 table

… and allows us to add parts incrementally:

>>> query.SELECT('two').WHERE('condition')
<builder.Query object at 0x7fc953e60640>
>>> print(query)
SELECT
 one,
 two
FROM
 table
WHERE
 condition

While not required,
I recommend reading the previous articles
to get a sense of the problem we’re trying to solve,
and the context we’re solving it in.

In short, whatever we build should:

  • support SELECT with conditional WITH, WHERE, ORDER BY, JOIN etc.
  • expose the names of the result columns (for scrolling window queries)
  • be easy to use, understand and maintain

Trade-offs #

Our solution does not exist in a void;
it exists to be used by my feed reader library.

Notably, we’re not making a general-purpose library with external users
whose needs we’re trying to anticipate;
there’s exactly one user with a pretty well-defined use case,
and strict backwards compatibility is not necessary.

This allows us to make some upfront decisions to help with maintainability:

  • No needless customization. We can change the code directly if we need to.
  • No other features except the known requirements.
    We’ll add features as needed.
  • No effort to support other syntax than SQLite.
  • No extensive testing.
    We can rely on the existing comprehensive functional tests.
  • No SQL validation. The database does this already.
    • However, it would be nice to get at least a little error checking.
      No need for custom exceptions, any type is acceptable –
      they should come up only during development and testing anyway.

A minimal plausible solution #

Data representation #

As mentioned before,
my prototype was based on the idea that
queries can be represented as plain data structures.

Looking at a nicely formatted query,
a natural representation may come to mind:

SELECT
    one,
    two
FROM
    table
WHERE
    condition AND
    another-condition

See it?

It’s a mapping with a list of strings for each clause:

{
    'SELECT': [
        'one',
        'two',
    ],
    'FROM': [
        'table',
    ],
    'WHERE': [
        'condition',
        'another-condition',
    ],
}

Let’s use this as our starting model, and build ourselves a query builder 🙂

Classes #

We start with a class:

class Query:

    keywords = [
        'WITH',
        'SELECT',
        'FROM',
        'WHERE',
        'GROUP BY',
        'HAVING',
        'ORDER BY',
        'LIMIT',
    ]

    def __init__(self):
        self.data = {k: [] for k in self.keywords}

We use a class because most of the time
we don’t want to interact with the underlying data structure,
since it’s more likely to change.
We’re not subclassing dict,
since that would unintentionally expose its methods (and thus, behavior),
and we may need those names for something else.

Also, a class allows us to reduce verbosity:

# we want
query.SELECT('one', 'two').FROM('table')
# not
query['SELECT'].extend(['one', 'two'])
query['FROM'].append('table')

We store various data as class variables
instead of hardcoding it or using module variables
as a cheap way to customize things (more on that later).
Also, using a variable makes it clearer what those things are.

For now, we refrain from any customization in the initializer;
if we need more clauses, we can add them to keywords directly.

We add all the known keywords upfront to get free error checking –
we can rely on data[keyword] raising a KeyError for unknown keywords.

We could use dataclasses,
but of the generated magic methods, we’d only use the __repr__(),
and most of the time it would be too long to be useful.

Adding things #

Next, we add code for adding stuff:

    def add(self, keyword, *args):
        target = self.data[keyword]

        for arg in args:
            target.append(_clean_up(arg))

        return self

    def __getattr__(self, name):
        if not name.isupper():
            return getattr(super(), name)
        return functools.partial(self.add, name.replace('_', ' '))


def _clean_up(thing: str) -> str:
    return textwrap.dedent(thing.rstrip()).strip()

add() is roughly equivalent to data[keyword].extend(args).

The main difference is that we first dedent the arguments and remove trailing whitespace.
This is an upfront decision: we clean everything up and make as many choices when adding things,
so the part that generates output doesn’t have to care about any of that,
and error checking happens as early as possible.

Another difference is that it returns self,
which enables method chaining: query.add(...).add(...).


__getattr__() gets called when an attribute does not exist,
and allows us to return something instead of getting the default AttributeError.

What we return is a KEYWORD(*args) callable made on the fly
by wrapping add() in a partial (this is the metaprogramming part);
a closure would be functionally equivalent.1

Requiring the keywords to be uppercase is mostly a stylistic choice,
but it does have advantages:
it signals to the reader these are special "methods",
and avoids shadowing dunder methods like __deepcopy__() with no extra code.

We don’t even bother raising AttributeError explicitly,
we just let getattr() do it for us.

We could store the partial on the instance,
which would side-step __getattr__() on subsequent calls,
so we only make one partial per keyword;
we could do it upfront, in the constructor;
we could even generate actual methods,
so there’s only one per keyword per class!
Or we can do nothing – they’re likely premature optimization, and not worth the effort.


I said error checking happens as early as possible;
that’s almost true:
if you look carefully at the code,
you may notice query.ESLECT doesn’t raise an exception
until called, query.ESLECT().

Doing most of the work in add() has some benefits, though:
it allows it to be used with partial,
and it’s an escape hatch for when we want
to use a "keyword" that’s not a Python identifier
(it’ll be useful later).

Output #

Finally, we turn what we have back into an SQL string:

    default_separator = ','
    def __str__(self):
        return ''.join(self._lines())

    def _lines(self):
        for keyword, things in self.data.items():
            if not things:
                continue

            yield f'{keyword}\n'
            yield from self._lines_keyword(keyword, things)

    def _lines_keyword(self, keyword, things):
        for i, thing in enumerate(things, 1):
            last = i == len(things)

            yield self._indent(thing)

            if not last:
                yield self.default_separator

            yield '\n'

    _indent = functools.partial(textwrap.indent, prefix=' ')

The main API is str();
this requires almost zero effort to learn,
since it’s the standard way of turning things into strings in Python.

str(query) calls __str__, which delegates the work to the _lines() generator.
We use a generator mainly because it allows writing
yield line instead of rv.append(line),
making for somewhat cleaner code.

A second benefit of a generator is that it’s lazy:
this means we can pass it around
without having to build an intermediary list or string;
for example, to an open file’s writelines() method,
or in yield from in another generator
(e.g. if we allow nested subqueries).
We don’t need any of this here,
but it can be useful when generating a lot of values.

We split the logic for individual clauses into a separate generator, _lines_keyword(),
because we’ll keep adding stuff to it.
(I initially left everything in _lines(),
and refactored when things got too complicated;
no need to do that now.)

Since we’ll want to indent things in the same way in more than one place,
we make it a static "method" using partial.

You may notice we’re not sorting the clauses in any way;
dicts guarantee insertion order in Python 3.6 or newer2,
and we add them to data upfront,
so the order in keywords is maintained.

Tests #

To make sure we don’t break stuff that’s already working,
let’s add a simple test:

def test_query_simple():
    query = Query().SELECT('select').FROM('from-one', 'from-two')
    assert str(query) == dedent(
        """\
 SELECT
 select
 FROM
 from-one,
 from-two
 """
    )

We’ll keep adding to it at the end of each section,
but since it’s not all that interesting feel free to skip that.


For a minimal solution, we’re done; we have 62 lines / 38 statements.

The code so far:
builder.py,
test_builder.py.

Separators #

At this point, the WHERE output doesn’t really make sense:

>>> print(Query().WHERE('a', 'b'))
WHERE
 a,
 b

We fix it by special-casing the separators for a few clauses:

    separators = dict(WHERE='AND', HAVING='AND')
    def _lines_keyword(self, keyword, things):
        for i, thing in enumerate(things, 1):
            last = i == len(things)

            yield self._indent(thing)

            if not last:
                try:
                    yield ' ' + self.separators[keyword]
                except KeyError:
                    yield self.default_separator

            yield '\n'

We could’ve used defaultdict and gotten rid of default_separator,
but then we’d have to remember that the non-comma separators need a space (' AND');
it’s clearer to put it directly in code.

Also, we could’ve put the special separator on a new line ('one\n​AND two' vs. 'one AND\n​two').
While this is recommended by some style guides,
it makes the code a bit more complicated for little benefit,
and (maybe) makes it less obvious that AND is just another separator.

We add WHERE to the test.

def test_query_simple():
    query = (
        Query()
        .SELECT('select')
        .FROM('from-one', 'from-two')
        .WHERE('where-one', 'where-two')
    )
    assert str(query) == dedent(
        """\
 SELECT
 select
 FROM
 from-one,
 from-two
 WHERE
 where-one AND
 where-two
 """
    )

The code so far:
builder.py,
test_builder.py.

Aliases #

One of the requirements is making it possible to implement on top of our builder
scrolling window queries.

For this to happen,
code needs to get the result column names
(the SELECT expressions or their aliases),
so it can use them in the generated WHERE condition.

The output of the following looks OK, but to extract the alias programmatically,
we’d need to parse column AS alias:

>>> query = Query().SELECT('column AS alias')
>>> query.data['SELECT'][0]
'column AS alias'

Passing a pair of strings when we want an aliased column seems like an acceptable thing.
Since the column expression might be quite long,
we’ll make the alias the first thing in the pair.

>>> print(Query().SELECT(('alias', 'one'), 'two'))
SELECT
 one AS alias,
 two

As mentioned earlier, we’ll store the data in cleaned up and standard form,
so the output code doesn’t have to care about that.

A 2-tuple is a decent choice,
but to make the code easier to read,
we’ll depart a tiny bit from plain data structures,
and use a named tuple instead.

class _Thing(NamedTuple):
    value: str
    alias: str = ''

    @classmethod
    def from_arg(cls, arg):
        if isinstance(arg, str):
            alias, value = '', arg
        elif len(arg) == 2:
            alias, value = arg
        else:
            raise ValueError(f"invalid arg: {arg!r}")
        return cls(_clean_up(value), _clean_up(alias))

Conveniently, this also gives us a place to convert the string-or-pair,
in the form of the from_arg() alternate constructor.
We could’ve made it a stand-alone function,
but this makes clearer what’s being returned.

Note that we use an empty string to mean "no alias".
In general, it’s a good idea to distinguish this kind of absence by using None,
since the empty string may be a valid input,
and None can prevent some bugs – e.g. you can’t concatenate None to a string.
Here, an empty string cannot be a valid alias, so we don’t bother.

Using it is just a one-line change to add():

    def add(self, keyword, *args):
        target = self.data[keyword]

        for arg in args:
            target.append(_Thing.from_arg(arg))

        return self

On output, we have two concerns:

  1. there may or may not be an alias
  2. the order differs:
    you have SELECT expr AS column-alias,
    but WITH table-name AS (stmt)
    (we treat the CTE table name as an alias)

We can model this with mostly-empty defaultdicts with per-clause format strings:

    formats = (
        defaultdict(lambda: '{value}'),
        defaultdict(lambda: '{value} AS {alias}', WITH='{alias} AS {value}'),
    )

To choose between "has alias" and "doesn’t have alias",
we take advantage of True and False being equal to 1 and 0
(this may be too clever for our own good, but eh).

    def _lines_keyword(self, keyword, things):
        for i, thing in enumerate(things, 1):
            last = i == len(things)

            format = self.formats[bool(thing.alias)][keyword]
            yield self._indent(format.format(value=thing.value, alias=thing.alias))

            if not last:
                try:
                    yield ' ' + self.separators[keyword]
                except KeyError:
                    yield self.default_separator

            yield '\n'

We add an aliased expression to the test.

def test_query_simple():
    query = (
        Query()
        .SELECT('select-one', ('alias', 'select-two'))
        .FROM('from-one', 'from-two')
        .WHERE('where-one', 'where-two')
    )
    assert str(query) == dedent(
        """\
 SELECT
 select-one,
 select-two AS alias
 FROM
 from-one,
 from-two
 WHERE
 where-one AND
 where-two
 """
    )

The code so far:
builder.py,
test_builder.py.

Subqueries #

Currently, WITH is still a little broken:

>>> print(Query().WITH(('table-name', 'SELECT 1')))
WITH
 table-name AS SELECT 1

Since common table expressions always have the SELECT statement parenthesized,
we’d like to have that out of the box, and be indented properly:

WITH
    table-name AS (
        SELECT 1
    )

A simple way of handling this is to change the WITH format string to
'{alias} AS (\n{indented}\n)',
where indented is the value, but indented.3

This kinda works, but is somewhat limited in usefulness;
for instance, we can’t easily build something like this on top:

Query().FROM(('alias', 'SELECT 1'), is_subquery=True)

Instead, let’s keep refining our model about values,
and indicate something is a subquery using a flag:

class _Thing(NamedTuple):
    value: str
    alias: str = ''
    is_subquery: bool = False

    @classmethod
    def from_arg(cls, arg, **kwargs):
        if isinstance(arg, str):
            alias, value = '', arg
        elif len(arg) == 2:
            alias, value = arg
        else:
            raise ValueError(f"invalid arg: {arg!r}")
        return cls(_clean_up(value), _clean_up(alias), **kwargs)

We can then decide if a clause always has subqueries,
and set it accordingly:

    subquery_keywords = {'WITH'}
    def add(self, keyword, *args):
        target = self.data[keyword]

        kwargs = {}
        if keyword in self.subquery_keywords:
            kwargs.update(is_subquery=True)

        for arg in args:
            target.append(_Thing.from_arg(arg, **kwargs))

        return self

Using it for output is just an extra if:

    def _lines_keyword(self, keyword, things):
        for i, thing in enumerate(things, 1):
            last = i == len(things)

            format = self.formats[bool(thing.alias)][keyword]
            value = thing.value
            if thing.is_subquery:
                value = f'(\n{self._indent(value)}\n)'
            yield self._indent(format.format(value=value, alias=thing.alias))

            if not last:
                try:
                    yield ' ' + self.separators[keyword]
                except KeyError:
                    yield self.default_separator

            yield '\n'

We add WITH to our test.

def test_query_simple():
    query = (
        Query()
        .WITH(('alias', 'with'))
        .SELECT('select-one', ('alias', 'select-two'))
        .FROM('from-one', 'from-two')
        .WHERE('where-one', 'where-two')
    )
    assert str(query) == dedent(
        """\
 WITH
 alias AS (
 with
 )
 SELECT
 select-one,
 select-two AS alias
 FROM
 from-one,
 from-two
 WHERE
 where-one AND
 where-two
 """
    )

The code so far:
builder.py,
test_builder.py.

Joins #

One clause that’s missing is JOIN.
And it’s important: changing your mind about what you’re selecting from
happens quite often.

JOIN is a bit more complicated,
mostly because it has different forms – JOIN, LEFT JOIN and so on;
SQLite supports at least 10 variations.

I initially handled it by special-casing,
considering any keyword that contained JOIN a separate keyword.
This has a few drawbacks, though;
aside from making the code more complicated,
it reorders some of the tables:
query.JOIN('a').LEFT_JOIN('b').JOIN('c')
results in JOIN a JOIN b LEFT JOIN c.

A better solution is to continue refining our model.

Take a look at these railroad diagrams for the SELECT statement:

[Railroad diagrams: select-core (FROM clause), join-clause, and join-operator.]

You may notice the table-or-subquery followed by comma in FROM
is actually a subset of the table-or-subquery followed by join-operator in join-clause.
That is, for SQLite, a comma is just another join operator.

Put the other way around, a join operator is just another separator.

First, let’s record which join operator a value has:

class _Thing(NamedTuple):
    value: str
    alias: str = ''
    keyword: str = ''
    is_subquery: bool = False
    fake_keywords = dict(JOIN='FROM')
    def add(self, keyword, *args):
        keyword, fake_keyword = self._resolve_fakes(keyword)
        target = self.data[keyword]

        kwargs = {}
        if fake_keyword:
            kwargs.update(keyword=fake_keyword)
        if keyword in self.subquery_keywords:
            kwargs.update(is_subquery=True)

        for arg in args:
            target.append(_Thing.from_arg(arg, **kwargs))

        return self

    def _resolve_fakes(self, keyword):
        for part, real in self.fake_keywords.items():
            if part in keyword:
                return real, keyword
        return keyword, ''

We could’ve probably just hardcoded this in add()
(if 'JOIN' in keyword: ...),
but doing it like this makes it easier to see at a glance that
"JOIN is a fake FROM".

Using keyword as a separator is relatively straightforward:

    def _lines(self):
        for keyword, things in self.data.items():
            if not things:
                continue

            yield f'{keyword}\n'

            grouped = [], []
            for thing in things:
                grouped[bool(thing.keyword)].append(thing)
            for group in grouped:
                yield from self._lines_keyword(keyword, group)

    def _lines_keyword(self, keyword, things):
        for i, thing in enumerate(things, 1):
            last = i == len(things)

            if thing.keyword:
                yield thing.keyword + '\n'

            format = self.formats[bool(thing.alias)][keyword]
            value = thing.value
            if thing.is_subquery:
                value = f'(\n{self._indent(value)}\n)'
            yield self._indent(format.format(value=value, alias=thing.alias))

            if not last and not thing.keyword:
                try:
                    yield ' ' + self.separators[keyword]
                except KeyError:
                    yield self.default_separator

            yield '\n'

We add a JOIN to the test.

def test_query_simple():
    query = (
        Query()
        .WITH(('alias', 'with'))
        .SELECT('select-one', ('alias', 'select-two'))
        .FROM('from-one', 'from-two')
        .JOIN('join')
        .WHERE('where-one', 'where-two')
    )
    assert str(query) == dedent(
        """\
 WITH
 alias AS (
 with
 )
 SELECT
 select-one,
 select-two AS alias
 FROM
 from-one,
 from-two
 JOIN
 join
 WHERE
 where-one AND
 where-two
 """
    )

The code so far:
builder.py,
test_builder.py.

Distinct #

The final adjustment to our model is to support SELECT DISTINCT.

DISTINCT (or ALL) is like a flag that applies to the whole clause;
we’ll model it as such:

class _FlagList(list):
    flag: str = ''
    def __init__(self):
        self.data = {k: _FlagList() for k in self.keywords}

Since most of the time we’re OK with the default value of flag,
we don’t bother using an instance variable in __init__;
instead, we use a class variable;
setting flag on an instance will then shadow the class variable.

We set the flag based on a known list of keywords for each clause;
like with fake keywords, we pull the flag "parsing" logic into a separate method:

    flag_keywords = dict(SELECT={'DISTINCT', 'ALL'})
    def add(self, keyword, *args):
        keyword, fake_keyword = self._resolve_fakes(keyword)
        keyword, flag = self._resolve_flags(keyword)
        target = self.data[keyword]

        if flag:
            if target.flag:
                raise ValueError(f"{keyword} already has flag: {flag!r}")
            target.flag = flag

        kwargs = {}
        if fake_keyword:
            kwargs.update(keyword=fake_keyword)
        if keyword in self.subquery_keywords:
            kwargs.update(is_subquery=True)

        for arg in args:
            target.append(_Thing.from_arg(arg, **kwargs))

        return self
    def _resolve_flags(self, keyword):
        prefix, _, flag = keyword.partition(' ')
        if prefix in self.flag_keywords:
            if flag and flag not in self.flag_keywords[prefix]:
                raise ValueError(f"invalid flag for {prefix}: {flag!r}")
            return prefix, flag
        return keyword, ''

Using the flag for output is again straightforward:

    def _lines(self):
        for keyword, things in self.data.items():
            if not things:
                continue

            if things.flag:
                yield f'{keyword} {things.flag}\n'
            else:
                yield f'{keyword}\n'

            grouped = [], []
            for thing in things:
                grouped[bool(thing.keyword)].append(thing)
            for group in grouped:
                yield from self._lines_keyword(keyword, group)

We add a SELECT DISTINCT to our test.

def test_query_simple():
    query = (
        Query()
        .WITH(('alias', 'with'))
        .SELECT('select-one', ('alias', 'select-two'))
        .FROM('from-one', 'from-two')
        .JOIN('join')
        .WHERE('where-one', 'where-two')
        .SELECT_DISTINCT('select-three')
    )
    assert str(query) == dedent(
        """\
 WITH
 alias AS (
 with
 )
 SELECT DISTINCT
 select-one,
 select-two AS alias,
 select-three
 FROM
 from-one,
 from-two
 JOIN
 join
 WHERE
 where-one AND
 where-two
 """
    )

The code so far:
builder.py,
test_builder.py.

More tests #

Since now the simple test isn’t that simple anymore, we split it in two:
one with a really simple query, and one with a really complicated query.

Like so.

def test_query_simple():
    query = Query().SELECT('select').FROM('from').JOIN('join').WHERE('where')
    assert str(query) == dedent(
        """\
 SELECT
 select
 FROM
 from
 JOIN
 join
 WHERE
 where
 """
    )


def test_query_complicated():
    """Test a complicated query:

 * order between different keywords does not matter
 * arguments of repeated calls get appended, with the order preserved
 * SELECT can receive 2-tuples
 * WHERE and HAVING arguments are separated by AND
 * JOIN arguments are separated by the keyword, and come after plain FROM
 * no-argument keywords have no effect, unless they are flags

 """
    query = (
        Query()
        .WHERE()
        .OUTER_JOIN('outer join')
        .JOIN('join')
        .LIMIT('limit')
        .JOIN()
        .ORDER_BY('first', 'second')
        .SELECT('one')
        .HAVING('having')
        .SELECT(('two', 'expr'))
        .GROUP_BY('group by')
        .FROM('from')
        .SELECT('three', 'four')
        .FROM('another from')
        .WHERE('where')
        .ORDER_BY('third')
        .OUTER_JOIN('another outer join')
        # this isn't technically valid
        .WITH('first cte')
        .GROUP_BY('another group by')
        .HAVING('another having')
        .WITH(('fancy', 'second cte'))
        .JOIN('another join')
        .WHERE('another where')
        .NATURAL_JOIN('natural join')
        .SELECT()
        .SELECT_DISTINCT()
    )
    assert str(query) == dedent(
        """\
 WITH
 (
 first cte
 ),
 fancy AS (
 second cte
 )
 SELECT DISTINCT
 one,
 expr AS two,
 three,
 four
 FROM
 from,
 another from
 OUTER JOIN
 outer join
 JOIN
 join
 OUTER JOIN
 another outer join
 JOIN
 another join
 NATURAL JOIN
 natural join
 WHERE
 where AND
 another where
 GROUP BY
 group by,
 another group by
 HAVING
 having AND
 another having
 ORDER BY
 first,
 second,
 third
 LIMIT
 limit
 """
    )

The code so far:
builder.py,
test_builder.py.

More init #

One last feature:
I’d like to reuse the cleanup and indent logic to write parenthesized lists.
Good thing __init__ doesn’t do anything yet,
and we can add such conveniences there:

    def __init__(self, data=None, separators=None):
        self.data = {}
        if data is None:
            data = dict.fromkeys(self.keywords, ())
        for keyword, args in data.items():
            self.data[keyword] = _FlagList()
            self.add(keyword, *args)

        if separators is not None:
            self.separators = separators

Using it looks like:

>>> print(Query({'(': ['one', 'two'], ')': ['']}, separators={'(': 'OR'}))
(
 one OR
 two
)

We could’ve required the data argument to have the same structure as the attribute;
however, that’s quite verbose to write,
and I’d have to instantiate _Things and clean up strings myself;
that’s not very convenient.

Instead, we take it to mean "here’s some strings to add() for these keywords".

Note that if we ever do want to set the entire data structure,
we still can, with a tiny detour: q = Query(); q.data.update(...).

We add a separate test for the fancy __init__.

def test_query_init():
    query = Query({'(': ['one', 'two', 'three'], ')': ['']}, {'(': 'OR'})
    assert str(query) == dedent(
        """\
 (
 one OR
 two OR
 three
 )

 """
    )

We’re done; we have 148 lines / 101 statements.

Throughout this, even with the most basic of tests, coverage did not drop below 96%.

The final version of the code:
builder.py,
test_builder.py.

Things that didn’t make the cut #

When talking about trade-offs,
I said we’ll only add features as needed;
this may seem a bit handwavy –
how can I tell adding them won’t make the code explode?

Because I did add them;
that’s what prototyping was for.
But since they weren’t actually used, I removed them.
There’s no point in them rotting away in there.
They’re in git, we can always add them back later.

Here’s how you’d go about implementing a few of them.

Insert / update / delete #

Make them flag keywords, to support the OR $ACTION variants.

To make VALUES bake in the parentheses, set its format to ({value}).
That lets you add one values tuple at a time.

To add one column at a time, we could do this:

  • allow add()ing INSERT with arbitrary flags
  • make INSERT('column', into='table')
    a synonym of add('INSERT INTO table', 'column')
  • classify INSERT and VALUES as parens_keywords
    – like subquery_keywords, but they apply once per keyword, not per value

It’d look like this:

# first insert sets flag
query.INSERT('one', into='table').VALUES(':one')
# later we just add stuff
query.INSERT('two').VALUES(':two')

Arbitrary strings as subqueries #

Allow setting add(..., is_subquery=True); you’d then do:

query.FROM('subquery', is_subquery=True)
query.FROM('not subquery')

Query objects as subqueries #

Using Query objects as subqueries
without having to convert them explicitly to strings
would allow changing them after being add()ed.

To do it, we just need to allow _Thing.value to be a Query,
and override its is_subquery based on an isinstance() check.

Union / intersect / except #

This one goes a bit meta:

  • add a "virtual" COMPOUND keyword
  • add a new compound(keyword) method,
    which moves everything except WITH and ORDER BY to a subquery,
    and then appends it to data['COMPOUND'] with the appropriate fake keyword
  • make __getattr__ return a compound() partial for compound keywords
  • special-case COMPOUND in _lines()

That’s it for now. 🙂

Learned something new today? Share this with others, it really helps!

Want more? Get updates via email
or Atom feed.

This is my first planned series, and still a work in progress.

This means you get a say in it.
Email me your questions or comments,
and I’ll do my best to address them in one of the future articles.


  1. I’m sorry. [return]

  2. The guaranteed insertion order was actually added
    to the language specification in 3.7,
    but in 3.6 both CPython and PyPy had it as an implementation detail. [return]

  3. That’s what I did initially. [return]

Planet Python