Finding a lost song
with
Node.js & async iterators

Luciano Mammino (@loige)

Global Summit for Node.js

2022-05-18

Get these slides!

Photo by Darius Bashar on Unsplash

A random song you haven't listened to
in years pops into your head...

👂🐛

It doesn't matter what you do all day...
It keeps coming back to you! 🐛

And now you want to listen to it!

But, what if you can't remember

the title or the author?!

Photo by Tachina Lee on Unsplash

THERE MUST BE A WAY TO REMEMBER!

Photo by Marius Niveri on Unsplash

Today, I'll tell you how I solved this problem using

- Last.fm API

- Node.js

- Async Iterators

Let me introduce myself first...

👋 I'm Luciano (🇮🇹🍕🍝🤌)

👨‍💻 Senior Architect @ fourTheorem (Dublin 🇮🇪)

📔 Co-Author of Node.js Design Patterns 👉

Let's connect!

  loige.co (blog)

  @loige (twitter)

  loige (twitch)

  lmammino (github)

We are business focused technologists that deliver.


Accelerated Serverless | AI as a Service | Platform Modernisation

We are hiring: do you want to work with us?

So, there was this song in my mind... 🐛

I could only remember some random parts and the word "dark" (probably in the title)

Luciano - scrobbling since 12 Feb 2007

~250k scrobbles... that song must be there!

There's an API!

https://www.last.fm/api

Let's give it a shot

curl "http://ws.audioscrobbler.com/2.0/?method=user.getrecenttracks&user=loige&api_key=${API_KEY}&format=json" | jq .

It works! 🥳

Now let's do this with JavaScript

import { request } from 'undici'

const query = new URLSearchParams({
  method: 'user.getrecenttracks',
  user: 'loige',
  api_key: process.env.API_KEY,
  format: 'json'
})

const url = `https://ws.audioscrobbler.com/2.0/?${query}`
const { body } = await request(url)

const data = await body.json()
console.log(data)

We are getting a "paginated" response with 50 tracks per page

but there are 51 here! 🤔

(let's ignore this for now...)

How do we fetch the next pages?

let page = 1
while (true) {
  const query = new URLSearchParams({
    method: 'user.getrecenttracks',
    user: 'loige',
    api_key: process.env.API_KEY,
    format: 'json',
    page
  })

  const url = `https://ws.audioscrobbler.com/2.0/?${query}`
  const { body } = await request(url)
  const data = await body.json()
  console.log(data)

  if (page === Number(data.recenttracks['@attr'].totalPages)) {
    break // it's the last page!
  }

  page++
}

Seems good! 👌

Let's look at the tracks...

// ...
for (const track of data.recenttracks.track) {
  console.log(
    track.date?.['#text'],
    `${track.artist['#text']} - ${track.name}`
  )
}
console.log('--- end page ---')
// ...

* Note that page size here is 10 tracks per page

Every page has a song with undefined time...

This is the song I am currently listening to!

It appears at the top of every page.

Sometimes there are duplicated tracks between pages... 😨

The "sliding windows" problem 😩

tracks (newest to oldest): Page 1, Page 2, ...

When a new track arrives, a track moves from page 1 to page 2.
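
The duplication can be reproduced without calling the API. This is only a minimal sketch: the in-memory `feed` array and the `getPage` helper are hypothetical stand-ins for the page-numbered request, not part of Last.fm.

```javascript
// Hypothetical in-memory feed, newest track first (not the real Last.fm API)
const feed = ['track5', 'track4', 'track3', 'track2', 'track1']

// Hypothetical helper: page-number pagination over the feed
function getPage (page, limit = 2) {
  const start = (page - 1) * limit
  return feed.slice(start, start + limit)
}

const page1 = getPage(1) // ['track5', 'track4']

// a new scrobble arrives between the two requests...
feed.unshift('track6')

const page2 = getPage(2) // ['track4', 'track3']

// 'track4' has slid from page 1 to page 2, so we see it twice
const duplicates = page2.filter((track) => page1.includes(track))
console.log(duplicates) // [ 'track4' ]
```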

Time based windows 😎

tracks (newest to oldest)

Page 1, then the tracks before t1 (page 1 "to" t1), then the tracks before t2 (page 1 "to" t2), ...*

* we are done when we get an empty page (or num pages is 1)

let to = ''
while (true) {
  const query = new URLSearchParams({
    method: 'user.getrecenttracks',
    user: 'loige',
    api_key: process.env.API_KEY,
    format: 'json',
    limit: '10',
    to
  })
  const url = `https://ws.audioscrobbler.com/2.0/?${query}`
  const { body } = await request(url)

  const data = await body.json()
  const tracks = data.recenttracks.track

  console.log(
    `--- ↓ page to ${to}`,
    `remaining pages: ${data.recenttracks['@attr'].totalPages} ---`
  )
  
  for (const track of tracks) {
    console.log(track.date?.uts, `${track.artist['#text']} - ${track.name}`)
  }

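  // stop on an empty page too: there is no track to use as the next boundary
  if (tracks.length === 0 || Number(data.recenttracks['@attr'].totalPages) <= 1) {
    break // it's the last page!
  }

  const lastTrackInPage = tracks[tracks.length - 1]
  to = lastTrackInPage.date.uts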
}

The timestamp of the last track becomes the boundary for the next page
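
To see why a time boundary is not affected by new scrobbles, here is a rough offline sketch. The `scrobbles` array, its made-up `uts` values, and the `getTracksBefore` helper are all hypothetical, and the sketch assumes the boundary is exclusive (only tracks strictly older than `to` are returned).

```javascript
// Hypothetical in-memory feed, newest first; uts values are made up
const scrobbles = [
  { name: 'track5', uts: 50 },
  { name: 'track4', uts: 40 },
  { name: 'track3', uts: 30 },
  { name: 'track2', uts: 20 },
  { name: 'track1', uts: 10 }
]

// Hypothetical helper: up to `limit` tracks strictly older than `to`
// (assumption: exclusive boundary; an empty `to` means "start from now")
function getTracksBefore (to, limit = 2) {
  const older = to === '' ? scrobbles : scrobbles.filter((t) => t.uts < to)
  return older.slice(0, limit)
}

function readAll () {
  const seen = []
  let to = ''
  while (true) {
    const tracks = getTracksBefore(to)
    if (tracks.length === 0) break // empty page: we are done
    seen.push(...tracks.map((t) => t.name))
    if (seen.length === 2) {
      // a new scrobble arrives mid-iteration: older windows do not shift
      scrobbles.unshift({ name: 'track6', uts: 60 })
    }
    to = tracks[tracks.length - 1].uts
  }
  return seen
}

console.log(readAll())
// [ 'track5', 'track4', 'track3', 'track2', 'track1' ]
```

No track shows up twice, even though the feed changed while we were reading it.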

We have a working solution! 🎉
Can we generalise it?

We know how to iterate over every page/track.
How do we expose this information?

const reader = LastFmRecentTracks({
  apikey: process.env.API_KEY,
  user: 'loige'
})

// callbacks
reader.readPages(
  (page) => { /* ... */ }, // on page
  (err) => { /* ... */ } // on completion (or error)
)

// event emitter
reader.read()
reader.on('page', (page) => { /* ... */ })
reader.on('completed', (err) => { /* ... */ })

// streams ❤️
reader.pipe(/* transform or writable stream here */)
reader.on('end', () => { /* ... */ })
reader.on('error', () => { /* ... */ })

// streams pipeline ❤️❤️
pipeline(
  reader,
  yourProcessingStream,
  (err) => {
    // handle completion or err
  }
)

// ASYNC ITERATORS! 😵
for await (const page of reader) {
  /* ... */
}

// ... do more stuff when all the
// data is consumed

// ASYNC ITERATORS WITH ERROR HANDLING! 🤯
try {
  for await (const page of reader) {
    /* ... */
  }
} catch (err) {
  // handle errors
}

// ... do more stuff when all the 
// data is consumed
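
A small sketch of what that try/catch buys us. The `failingPages` source is hypothetical: it stands in for a reader whose third request fails. The loop is wrapped in an async function so the sketch does not rely on top-level await.

```javascript
// Hypothetical source that fails after two pages
async function * failingPages () {
  yield ['page 1']
  yield ['page 2']
  throw new Error('request failed')
}

async function consume () {
  const received = []
  try {
    for await (const page of failingPages()) {
      received.push(page)
    }
  } catch (err) {
    // the error thrown by the source lands here, like in synchronous code
    return { received, error: err.message }
  }
  return { received, error: null }
}

consume().then((result) => console.log(result))
// { received: [ [ 'page 1' ], [ 'page 2' ] ], error: 'request failed' }
```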

How can we build an async iterator? 🧐

Meet the iteration protocols!

Iteration concepts

Iterator

An object that acts as a cursor to iterate over blocks of data sequentially

Iterable

An object that contains data that can be iterated over sequentially

The iterator protocol

An object is an iterator if it has a next() method. Every time you call it, it returns an object with the keys done (boolean) and value.

function createCountdown (from) {
  let nextVal = from
  return {
    next () {
      if (nextVal < 0) {
        return { done: true }
      }

      return { 
        done: false,
        value: nextVal--
      }
    }
  }
}
const countdown = createCountdown(3)
console.log(countdown.next())
// { done: false, value: 3 }

console.log(countdown.next())
// { done: false, value: 2 }

console.log(countdown.next())
// { done: false, value: 1 }

console.log(countdown.next())
// { done: false, value: 0 }

console.log(countdown.next())
// { done: true }

Generator functions "produce" iterators!

function * createCountdown (from) {
  for (let i = from; i >= 0; i--) {
    yield i
  }
}
const countdown = createCountdown(3)
console.log(countdown.next())
// { done: false, value: 3 }

console.log(countdown.next())
// { done: false, value: 2 }

console.log(countdown.next())
// { done: false, value: 1 }

console.log(countdown.next())
// { done: false, value: 0 }

console.log(countdown.next())
// { done: true, value: undefined }

The iterable protocol

An object is iterable if it implements the Symbol.iterator method, a zero-argument function that returns an iterator.

function createCountdown (from) {
  let nextVal = from
  return {
    [Symbol.iterator]: () => ({
      next () {
        if (nextVal < 0) {
          return { done: true }
        }

        return { done: false, value: nextVal-- }
      }
    })
  }
}

// the same iterable, written with a generator function
function createCountdown (from) {
  return {
    [Symbol.iterator]: function * () {
      for (let i = from; i >= 0; i--) {
        yield i
      }
    }
  }
}
const countdown = createCountdown(3)

for (const value of countdown) {
  console.log(value)
}

// 3
// 2
// 1
// 0
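
for...of is not the only consumer of iterables. As a quick sketch (repeating the generator-based `createCountdown` so the snippet stands alone), the same object also works with spread syntax, array destructuring and Array.from:

```javascript
function createCountdown (from) {
  return {
    [Symbol.iterator]: function * () {
      for (let i = from; i >= 0; i--) {
        yield i
      }
    }
  }
}

const countdown = createCountdown(3)

// spread syntax
console.log([...countdown]) // [ 3, 2, 1, 0 ]

// array destructuring (takes only the values it needs)
const [first, second] = countdown
console.log(first, second) // 3 2

// Array.from
console.log(Array.from(countdown)) // [ 3, 2, 1, 0 ]
```

Every consumer calls Symbol.iterator again and gets a fresh iterator, which is why the countdown restarts from 3 each time.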

OK. So far this is all synchronous iteration.
What about async? 🙄

The async iterator protocol

An object is an async iterator if it has a next() method. Every time you call it, it returns a promise that resolves to an object with the keys done (boolean) and value.

import { setTimeout } from 'timers/promises'

function createAsyncCountdown (from, delay = 1000) {
  let nextVal = from
  return {
    async next () {
      await setTimeout(delay)
      if (nextVal < 0) {
        return { done: true }
      }

      return { done: false, value: nextVal-- }
    }
  }
}
const countdown = createAsyncCountdown(3)
console.log(await countdown.next())
// { done: false, value: 3 }

console.log(await countdown.next())
// { done: false, value: 2 }

console.log(await countdown.next())
// { done: false, value: 1 }

console.log(await countdown.next())
// { done: false, value: 0 }

console.log(await countdown.next())
// { done: true }

import { setTimeout } from 'timers/promises'

// async generators "produce" async iterators!

async function * createAsyncCountdown (from, delay = 1000) {
  for (let i = from; i >= 0; i--) {
    await setTimeout(delay)
    yield i
  }
}

The async iterable protocol

An object is an async iterable if it implements the Symbol.asyncIterator method, a zero-argument function that returns an async iterator.

import { setTimeout } from 'timers/promises'

function createAsyncCountdown (from, delay = 1000) {
  return {
    [Symbol.asyncIterator]: async function * () {
      for (let i = from; i >= 0; i--) {
        await setTimeout(delay)
        yield i
      }
    }
  }
}

HOT TIP 🔥

With async generators we can create objects that are both async iterators and async iterables!

(We don't need to specify Symbol.asyncIterator explicitly!)

import { setTimeout } from 'timers/promises'

// async generators "produce" async iterators
// (and iterables!)

async function * createAsyncCountdown (from, delay = 1000) {
  for (let i = from; i >= 0; i--) {
    await setTimeout(delay)
    yield i
  }
}
const countdown = createAsyncCountdown(3)

for await (const value of countdown) {
  console.log(value)
}
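
A quick way to check that the object returned by an async generator really is both things at once. This is just a sketch: to keep it free of imports it wraps the global setTimeout in a promise instead of using timers/promises, and it uses a short delay.

```javascript
async function * createAsyncCountdown (from, delay = 10) {
  for (let i = from; i >= 0; i--) {
    await new Promise((resolve) => setTimeout(resolve, delay))
    yield i
  }
}

const asyncCountdown = createAsyncCountdown(2)

// it is an async iterator: it has a next() method
console.log(typeof asyncCountdown.next) // function

// it is an async iterable: Symbol.asyncIterator returns the object itself
console.log(asyncCountdown[Symbol.asyncIterator]() === asyncCountdown) // true

async function demo () {
  // use it as an iterator...
  console.log(await asyncCountdown.next()) // { value: 2, done: false }

  // ...then as an iterable: the loop continues from where we stopped
  const rest = []
  for await (const value of asyncCountdown) {
    rest.push(value)
  }
  console.log(rest) // [ 1, 0 ]
}

demo()
```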

Now we know how to make our LastFmRecentTracks an Async Iterable 🤩

import { request } from 'undici'


async function* createLastFmRecentTracks (apiKey, user) {
  let to = ''
  while (true) {
    const query = new URLSearchParams({
      method: 'user.getrecenttracks',
      user,
      api_key: apiKey,
      format: 'json',
      to
    })
    const url = `https://ws.audioscrobbler.com/2.0/?${query}`
    const { body } = await request(url)

    const data = await body.json()
    const tracks = data.recenttracks.track

    yield tracks

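    // stop on an empty page too: there is no track to use as the next boundary
    if (tracks.length === 0 || Number(data.recenttracks['@attr'].totalPages) <= 1) {
      break // it's the last page!
    }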

    const lastTrackInPage = tracks[tracks.length - 1]
    to = lastTrackInPage.date.uts
  }
}
const recentTracks = createLastFmRecentTracks(
  process.env.API_KEY,
  'loige'
)

for await (const page of recentTracks) {
  console.log(page)
}

Let's search for all the songs that contain the word "dark" in their title! 🧐

const recentTracks = createLastFmRecentTracks(
  process.env.API_KEY,
  'loige'
)

for await (const page of recentTracks) {
  for (const track of page) {
    if (track.name.toLowerCase().includes('dark')) {
      console.log(`${track.artist['#text']} - ${track.name}`)
    }
  }
}
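
If the consumer only cares about tracks, the nested loop could be hidden behind a second async generator. This is one possible sketch, not part of the original solution: `eachTrack` and `findByTitle` are hypothetical names, and `fakePages` (with made-up track names) stands in for createLastFmRecentTracks so the snippet runs offline.

```javascript
// Hypothetical stand-in for createLastFmRecentTracks (no network needed)
async function * fakePages () {
  yield [{ name: 'Song A' }, { name: 'Dark Song B' }]
  yield [{ name: 'Song C (dark mix)' }, { name: 'Song D' }]
}

// Flattens any async iterable of pages into a sequence of single tracks
async function * eachTrack (pages) {
  for await (const page of pages) {
    yield * page
  }
}

async function findByTitle (pages, word) {
  const matches = []
  for await (const track of eachTrack(pages)) {
    if (track.name.toLowerCase().includes(word)) {
      matches.push(track.name)
    }
  }
  return matches
}

findByTitle(fakePages(), 'dark').then((matches) => console.log(matches))
// [ 'Dark Song B', 'Song C (dark mix)' ]
```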

OMG! This is the song! 😱
...from 9 years ago!

🔥 HOT TIP

You can make your Async Iterators "smart" by adding extra behaviours (e.g. error handling and automatic retries)
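
One way this could look, as a sketch only. `withRetries`, `createFlakyFetcher` and `createResilientPages` are hypothetical names; the flaky fetcher simulates a request that fails once and then succeeds, in place of a real HTTP call.

```javascript
// Hypothetical helper: runs fn, retrying up to `retries` more times on failure
async function withRetries (fn, retries = 3, delay = 10) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn()
    } catch (err) {
      if (attempt >= retries) throw err // give up and surface the error
      await new Promise((resolve) => setTimeout(resolve, delay))
    }
  }
}

// Hypothetical page fetcher that fails on its first call, then succeeds
function createFlakyFetcher () {
  let calls = 0
  return async function fetchPage (pageNumber) {
    calls++
    if (calls === 1) throw new Error('temporary failure')
    return [`track from page ${pageNumber}`]
  }
}

// The retries live inside the iterator: the consumer never sees them
async function * createResilientPages (fetchPage, totalPages) {
  for (let page = 1; page <= totalPages; page++) {
    yield await withRetries(() => fetchPage(page))
  }
}

async function collectPages () {
  const pages = []
  for await (const page of createResilientPages(createFlakyFetcher(), 2)) {
    pages.push(page)
  }
  return pages
}

collectPages().then((pages) => console.log(pages))
// [ [ 'track from page 1' ], [ 'track from page 2' ] ]
```

If every attempt fails, the error is rethrown and can still be handled with try/catch around the for await loop.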

Photo by Lee Campbell on Unsplash

❤️ Thanks to @goldbergyoni, Jacek Spera, @eoins, @pelger, @gbinside, @ManuEomm, @simonplend for reviews and suggestions.

for await (const _ of createAsyncCountdown(1_000_000)) {
  console.log("THANK YOU! 😍")
}