When I first found out that Apple uses Cassandra, I mused that perhaps they would open source a Swift Cassandra client. That was in 2016. But, here we are in 2022 and it actually happened! Apple’s client is now on GitHub and I’m pumped.

What is Cassandra?

Cassandra is a NoSQL database. It does not support the traditional SQL statements or constructs you’d find in Postgres or MySQL, though its query language, CQL, is very close. Cassandra (and NoSQL in general) makes big trade-offs between functionality and performance/availability. If you do server-side work, I think Cassandra is a tool worth at least knowing about.

I’ve written about Keyspaces before, in a post that goes into more depth about why you might pick Cassandra over another database. Many people building low-volume, cost-sensitive systems default to DynamoDB because it has a free tier. I was looking for a way to run a very-low-cost system, and at the time, it was the only option from AWS. But, I’ve since transitioned over to Keyspaces. It also has a free tier and I think for most similar uses, it is the better choice. I’ve found it far easier to use, and it is at least potentially portable across cloud providers.

Running a Cluster

For better or worse, I became a Cassandra database admin during my time at Crashlytics. I learned more than I’d like to know about Cassandra operations. And, while I came to really like many things about it, I wouldn’t want to run a cluster again. I didn’t know it at the time, but Cassandra is notoriously difficult to manage successfully. This reputation was, at least when I was doing it, well-deserved.

This has motivated the development of hosted Cassandra services. These are fully managed Cassandra systems that you can just use without dealing with operations. The ones I know of are AWS Keyspaces, Astra, and an offering on Azure.

I’ve been using Keyspaces since it was in beta, and I think it is excellent.

Using Swift

DataStax, a company heavily involved in the Cassandra project, maintains a number of client libraries. I spent a little time wrapping their C++ driver in Swift, and got as far as a basic proof-of-concept. But, I never took it further than that. I find it really interesting that Apple’s client is structured the same way. Though, they actually did a good job and also finished theirs.

I don’t have much experience using Swift on a server. I experimented with using a Swift-based Lambda function to read from an AWS SQS queue, and that’s it. It was maybe 20 lines of code. But I never really went further with it, mainly because of the lack of Cassandra support.

But, now this isn’t an obstacle anymore! So, let’s build a simple system that does HTTP API -> Lambda -> Keyspaces all using only Swift. And, it will all comfortably fit within the AWS free-tier limits.

Managing AWS Resources

Nearly all of the tutorials I’ve found for using AWS and Swift involve the AWS web console. But, AWS also supports programmatic configuration, via a system called CloudFormation. I learned CloudFormation pretty late, but I think it’s the way to go for even small-sized deployments. It has a pretty steep learning curve, but it is worth it. Managing stuff with the AWS console is possible, but gets untenable quite quickly.

Just don’t forget the rule: never manually delete resources created by CloudFormation.

Using the CloudFormation template I’ve produced here will allow you to both create and remove nearly all the resources needed for this example with a few clicks. There are a few things, however, that I wasn’t able to automate. So, there will still be some console usage. I’ve kind of glossed over all that, so if this is your first time using AWS, it could be tricky. Don’t be shy about reaching out to me with questions!

I also want to point out that CloudFormation can be used to create IAM resources. IAM controls the security of your AWS account. The template I’ve made needs to create an IAM role. It is tightly scoped to allow just what is required, but you may still want to have a look.

Preparation

Using CloudFormation to manage the creation of an S3 bucket is tricky. So, I’m not even going to attempt that here. To get this example going, though, you’ll need one. I like to make a deploy.MY_DOMAIN bucket for CloudFormation-related resources.

Also, I haven’t yet worked out how to correctly sign requests with Apple’s Cassandra client. This means that you need to create a special IAM user with Keyspaces-specific credentials. I think this is possible with CloudFormation, but I decided not to even try, because I hope to work this out and make this all unnecessary. And that would be nice, because IAM is incredibly complicated.

So, you need to create a new IAM user manually. These are the policies it requires; they can be added inline. Note that you need to fill in your account ID (no hyphens). You can also use the visual editor to create these policies, which is what I did.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "cassandra:Create",
                "cassandra:Select",
                "cassandra:Modify"
            ],
            "Resource": [
                "arn:aws:cassandra:*:YOUR_ACCOUNT_ID_GOES_HERE:/keyspace/swift_keyspaces_test/",
                "arn:aws:cassandra:*:YOUR_ACCOUNT_ID_GOES_HERE:/keyspace/swift_keyspaces_test/table/*"
            ]
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": "cassandra:Select",
            "Resource": "arn:aws:cassandra:*:YOUR_ACCOUNT_ID_GOES_HERE:/keyspace/system/table/*"
        }
    ]
}
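
Filling in the account ID by hand is easy to get wrong, so here is a minimal shell sketch of that substitution. It assumes the policy above has been saved to a file. The function names normalize_account_id and fill_policy, and the file names in the usage comment, are made up for this example and are not part of any AWS tool.

```shell
# Strip hyphens from an account ID as displayed in the console,
# e.g. 1234-5678-9012 becomes 123456789012.
normalize_account_id() {
    printf '%s\n' "$1" | sed 's/-//g'
}

# Read a policy on stdin and write it to stdout with the
# YOUR_ACCOUNT_ID_GOES_HERE placeholder replaced by the given account ID.
fill_policy() {
    sed "s/YOUR_ACCOUNT_ID_GOES_HERE/$(normalize_account_id "$1")/g"
}

# Usage (hypothetical file names):
#   fill_policy 1234-5678-9012 < policy.json > policy.filled.json
```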

Once that’s done, generate and record the Keyspaces credentials on the “Security credentials” page. We’ll need these later.

Building

I had a surprising amount of difficulty getting a simple Lambda function working with Swift. These problems basically came down to a slightly-out-of-date tutorial, and the fact that I’m using an ARM Mac. I’ve tried my best to automate this all away. All the code you’ll need is open source. Swift currently cannot easily cross-compile from a Mac to Lambda’s Linux environment, so you will also need Docker installed.

With all that, you can use some shell scripts I made to create the two artifacts we need. The first produces a Lambda layer that just packages up some needed certificates for authentication. The second actually builds the executable.

cd path/to/swift-keyspaces-repo

sh scripts/layer.sh
sh scripts/build.sh
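
Since the build runs inside Docker, it can help to check that it is installed before kicking off the scripts. This is only a sketch: require_tool is a hypothetical helper, not something the repo’s scripts provide.

```shell
# Return 0 if the named command is on PATH.
# Otherwise print a message to stderr and return 1.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "missing required tool: $1" >&2
    return 1
}

# Usage: only run the build scripts when Docker is available.
#   require_tool docker && sh scripts/layer.sh && sh scripts/build.sh
```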

Deploying

The artifacts that these scripts produce need to be copied to that S3 bucket we mentioned earlier. You also need to copy the CloudFormation template SwiftKeyspaces.yml, located in the root of the repo, to that bucket.
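
If you have the AWS CLI installed and configured, the copy can be done from the terminal instead of the S3 console. Here is a rough sketch, assuming your credentials can write to the bucket. The upload_to_bucket function and its DRY_RUN switch are my own inventions for illustration; only the aws s3 cp command itself comes from the CLI.

```shell
# Copy each given file to the root of an S3 bucket.
# $1 is the bucket name, the remaining arguments are local files.
# With DRY_RUN=1 the aws commands are printed instead of executed.
upload_to_bucket() {
    bucket="$1"
    shift
    for file in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "aws s3 cp $file s3://$bucket/"
        else
            aws s3 cp "$file" "s3://$bucket/" || return 1
        fi
    done
}
```

Running DRY_RUN=1 upload_to_bucket YOUR_BUCKET_NAME SwiftKeyspaces.yml first shows what would be copied. Pass the files the build scripts produced as additional arguments.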

With all of this in place, we can use CloudFormation to create all of the resources needed to test this out. The CloudFormation console has a wizard that guides you through the process. Here are the highlights:

  • Head on over to the CloudFormation console
  • Create a new stack (with new resources)
  • Use the default “Template is ready” option
  • Point it to https://s3.amazonaws.com/YOUR_BUCKET_NAME/SwiftKeyspaces.yml
  • Name it “SwiftKeyspaces”
  • Select the architecture of your local machine
  • Provide your bucket name
  • Tick the box allowing IAM access
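
The same steps can probably be done with the AWS CLI, if you prefer that to the wizard. What follows is a sketch under some assumptions: I have not listed the template’s parameter keys here, so they are passed through as arguments, and the key name in the usage comment is purely hypothetical. The template_url and create_stack functions are also made up for this example. CAPABILITY_IAM is the CLI counterpart of ticking the IAM box. If the template gives its IAM role an explicit name, CloudFormation asks for CAPABILITY_NAMED_IAM instead.

```shell
# Print the template URL for a bucket, in the same form the wizard expects.
template_url() {
    echo "https://s3.amazonaws.com/$1/SwiftKeyspaces.yml"
}

# Create the SwiftKeyspaces stack from the uploaded template.
# $1 is the bucket name. The remaining arguments (at least one is required)
# are handed to --parameters as ParameterKey=...,ParameterValue=... pairs.
create_stack() {
    bucket="$1"
    shift
    aws cloudformation create-stack \
        --stack-name SwiftKeyspaces \
        --template-url "$(template_url "$bucket")" \
        --capabilities CAPABILITY_IAM \
        --parameters "$@"
}

# Usage (the parameter key "BucketName" is a guess, check the template):
#   create_stack YOUR_BUCKET_NAME ParameterKey=BucketName,ParameterValue=YOUR_BUCKET_NAME
```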

Let it rip. I kind of enjoy watching the events go by. It’s fascinating to see everything being created. When finished, it will produce an “Output” with the URL of the HTTP API we can use. However, we have just two more manual steps to complete first. I promise, we’re almost there.

Data and Credentials

Hopefully this went smoothly. If it did not, please open up an issue in the project repo and I’ll help!

We now have two manual steps left. First, we need to put some test data into the database. Navigate over to the Keyspaces console, and select the “CQL editor”. Run the following statement:

INSERT INTO swift_keyspaces_test.test_table (key, value) VALUES ('hello', 'world')

Ok, now remember that special Keyspaces user we created? We have to get those credentials into our Lambda function. Head on over to the Lambda console, and select the SwiftKeyspaces function. From there, select “Configuration” > “Environment Variables”. Edit them, replacing the placeholders with the real values. Assuming you used my policies from above, these credentials cannot do anything but interact with our test database.

Finally

Whew. We can now finally use our HTTP API.

> curl https://0b8k4zfibe.execute-api.us-east-1.amazonaws.com/Test/keyspaces
[{"key":"hello","value":"world"}]

I know that was quite a lot of setup for something so trivial. But, this proof-of-concept is pretty incredible and something I’ve been wanting to try for years. I currently have a similar system using Go. It has performed very well, and has cost me basically zero. But, if I had to do it all over again, I’d definitely give Swift a shot.

I’d also genuinely love to hear about your use of Cassandra. Designing schemas is tricky, but I really enjoy it! If you have questions about anything here, please reach out!