Combining an external folder with _posts in Jekyll v2
I take notes in Markdown, and this blog uses Markdown for its content. Naturally, I wanted to write my blog posts in my notes without having to keep the two in sync.
I had several objectives:
- The file in the notes would be used for the blog. I’d not need to copy files between the two.
- I could selectively include files to be published.
- I could have additional content in the files that would not be published.
- Images and links would work.
- I could schedule posts.
To accomplish that I:
- Created a symbolic link to the blog directory within my notes
- Added a Jekyll hook that takes the posts from that directory and combines them into the `posts` collection
- Fixed links and embedded images
- Added the ability to exclude parts of the markdown file from the generated content
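The last step can be tried on its own before it is wired into a plugin. Below is a minimal sketch, assuming the markers are written as plain HTML comments; `strip_excluded` and `EXCLUDE_BLOCK` are hypothetical names used only for illustration.

```ruby
# Hypothetical helper (not part of the plugin): drops every span that starts
# at an exclude marker and ends at the next include marker.
EXCLUDE_BLOCK = /<!-- exclude -->.*?<!-- include -->/m

def strip_excluded(markdown)
  # Non-greedy match with /m so a block may span several lines and
  # each block is removed separately.
  markdown.gsub(EXCLUDE_BLOCK, "")
end

note = <<~MARKDOWN
  Public intro.
  <!-- exclude -->
  Private reminder to self.
  <!-- include -->
  Public conclusion.
MARKDOWN

puts strip_excluded(note)
```

Only the two public lines remain in the output, with a blank line where the private block used to be.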
This is the end result:
# _plugins/filter.rb
# Time.parse comes from Ruby's `time` standard library.
require 'time'
module Filter
def self.process(site, payload)
# The link to the notes is named _linked which, with `linked` listed under
# `collections` in _config.yml, gives a `linked` collection in Jekyll.
# This takes all the files from there and keeps only the ones that have the `publish_on`
# attribute set in the front matter.
# This allows me to keep track of drafts or schedule publishing.
site.collections['linked'].docs.select!{|x| x.data['publish_on']}
now = Time.now
# Next, I loop through the collection in order to inject each file into the `posts`
# collection. By default, Jekyll expects the filename to contain the date.
# Because my files do not have one, their dates default to the file
# creation date. Since that isn't the order that I want to publish them
# in, I've added a `publish_on` attribute to the front matter
# that is parsed, and posts are then ordered by that instead.
site.collections['linked'].docs.sort_by{|x| Time.parse(x.data['publish_on'])}.each do |x|
t = Time.parse(x.data['publish_on'])
# For normal builds, I want to include only the posts that should have
# already been published. But, if the --future flag is set when running
# Jekyll, then all scheduled posts will be rendered.
if t <= now || site.config['future']
# Here, a new Jekyll::Document is created that is a copy of the linked
# document. It's then tied to the posts collection.
new_doc = Jekyll::Document.new(
x.path,
{site: site, collection: site.collections['posts']}
)
# Jekyll triggers this under the hood, although I haven't found exactly
# where. Instead, I trigger it manually. This reads the contents of the
# file and sets the front matter into the data attribute of the document.
new_doc.read
# The date is set to the parsed `publish_on` attribute.
new_doc.data['date'] = t
# I do not use the draft attribute explicitly. I use only the `publish_on`.
# This line can be excluded if you intend to have a separate attribute in
# the front matter to explicitly manage the draft status.
new_doc.data['draft'] = false
# Copy the categories and the description from the front matter.
# I'm not sure why this is not picked up immediately.
new_doc.data['categories'] = x.data['categories']
# I use `desc` in the notes but Jekyll uses `description`.
new_doc.data['description'] = x.data['desc']
# Remove the now-duplicate `desc` key from the new document.
new_doc.data.delete 'desc'
# Set the layout.
new_doc.data['layout'] = 'post'
# Keep only the last dot-separated part of the slug. Note that this isn't
# removing the extension; some of my earlier posts used to have files with
# the format blog.category.slug, and only the final part should be used.
new_doc.data['slug'] = x.data['slug'].split('.').last
# Set the __coll attribute.
# While looking into this I noticed this attribute set, although I'm not sure
# if it's strictly necessary or where it's used.
new_doc.data['__coll'] = "posts"
# This line removes any content in between < !-- exclude --> and < !-- include -->
# This lets me have additional content in blog posts that I do not want
# to be included in the generated site. The non-greedy, multiline match
# removes each block separately and keeps the text after the closing
# marker, wherever the block sits in the file.
new_doc.content = new_doc.content.gsub(/< !-- exclude -->.*?< !-- include -->/m, "")
# Note that here there is a space between < and !. The actual file does
# not have the space. It's needed here because the syntax highlighter
# renders each term within a span, which would cause the delimiters to render
# as comments since they're no longer direct descendants of the code block.
# Fix the links to images because my blog notes are in notes/blog/<category>
# and the images are in notes/assets/images while in Jekyll images are /images
new_doc.content.gsub!("", "")
# I do maintain separate copies of images for the blog. This is mainly
# so that I can keep higher resolution images in my notes and serve
# smaller more efficient versions in the blog.
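# Fix links between posts because in the blog notes it'd be
# [Link](../<category>/slug.md). The category and the .md need
# to be removed so that only the slug remains. The dots are escaped and
# the groups exclude `]`, `/` and `)` so that several links on one line
# are rewritten independently.
new_doc.content.gsub!(/\[([^\]]+)\]\(\.\.\/[^\/)]+\/([^\/)]+)\.md\)/, '[\1](/\2)')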
# Finally, include the new document into the posts collection
site.collections['posts'].docs << new_doc
end
end
end
end
# Register the hook to run whenever files are read.
Jekyll::Hooks.register :site, :post_read do |site, payload|
Filter.process(site, payload)
end
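The link fix is easy to check outside Jekyll. Here is a small sketch, assuming note links look like `[text](../<category>/<slug>.md)`; `rewrite_note_links` and `NOTE_LINK` are hypothetical names. A pattern built from negated character classes rather than greedy `.+` groups keeps each link separate when several share a line.

```ruby
# Hypothetical helper mirroring the link fix: turns
# [text](../<category>/<slug>.md) into [text](/<slug>).
NOTE_LINK = /\[([^\]]+)\]\(\.\.\/[^\/)]+\/([^\/)]+)\.md\)/

def rewrite_note_links(markdown)
  # \1 is the link text, \2 is the slug; the category is dropped.
  markdown.gsub(NOTE_LINK, '[\1](/\2)')
end

line = "See [hooks](../jekyll/hooks.md) and [symlinks](../linux/symlinks.md)."
puts rewrite_note_links(line)
# prints: See [hooks](/hooks) and [symlinks](/symlinks).
```

Links that do not start with `../`, such as external URLs, are left untouched.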
This way I don’t have to keep the notes in sync and any notes that I add will be automatically added to the site.
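Which notes actually get added depends on the `publish_on` rule. The following is a rough sketch of that rule using plain hashes instead of Jekyll documents; `due_notes` is a hypothetical name, and it assumes `publish_on` is stored as a string that `Time.parse` accepts.

```ruby
require "time"

# Hypothetical stand-in for the scheduling rule: keep notes that have a
# `publish_on` value and whose date has passed (or every dated note when
# `future` is true), ordered by that date.
def due_notes(notes, now: Time.now, future: false)
  notes
    .select { |note| note["publish_on"] }
    .map { |note| note.merge("date" => Time.parse(note["publish_on"])) }
    .select { |note| future || note["date"] <= now }
    .sort_by { |note| note["date"] }
end

notes = [
  { "slug" => "scheduled", "publish_on" => "2999-01-01" },
  { "slug" => "second",    "publish_on" => "2020-06-01" },
  { "slug" => "draft" },
  { "slug" => "first",     "publish_on" => "2020-01-01" }
]

now = Time.parse("2024-01-01")
puts due_notes(notes, now: now).map { |n| n["slug"] }.inspect
# prints: ["first", "second"]
puts due_notes(notes, now: now, future: true).map { |n| n["slug"] }.inspect
# prints: ["first", "second", "scheduled"]
```

The note without `publish_on` never appears, which is what makes the attribute double as a draft flag.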
If you run a site on Jekyll, take a look at the #jekyll tag for other useful Jekyll tweaks.