Best Practices for really long code blocks

jpessin1 · September 17, 2018, 8:26pm

We’re starting to see some longer blocks of code/logs, such as with PacBIO SmrtAnalysis. As a matter of readability and style, it might be nice to have the page open with those blocks partially folded.

What are folks’ thoughts on this?

Does a practical method exist within markdown/discourse?
@aculich @pfaffman et al.

pfaffman · September 17, 2018, 9:34pm

NOTE: I edited that post as recommended below, so you can go check it out now. (Hopefully that’s not overstepping my bounds!)

What you want are “code fences”.

```text
lots of text goes here
```

You can also do something like

```ruby
code
```

(if you omit the `text` or `ruby`, it’ll try to guess the language; sometimes it’s right and sometimes it’s wrong and the highlighting is annoying) and get

# Zendesk importer
#
# You will need a bunch of CSV files:
#
# - users.csv
# - topics.csv (topics in Zendesk are categories in Discourse)
# - posts.csv (posts in Zendesk are topics in Discourse)
# - comments.csv (comments in Zendesk are posts in Discourse)

require 'csv'
require 'reverse_markdown'
require_relative 'base'
require_relative 'base/generic_database'

# Call it like this:
#   RAILS_ENV=production bundle exec ruby script/import_scripts/zendesk.rb DIRNAME
class ImportScripts::Zendesk < ImportScripts::Base
  OLD_DOMAIN = "https://support.example.com"
  BATCH_SIZE = 1000

  def initialize(path)
    super()

    @path = path
    @db = ImportScripts::GenericDatabase.new(@path, batch_size: BATCH_SIZE, recreate: true)
  end

  def execute
    read_csv_files

    import_categories
    import_users
    import_topics
    import_posts
  end

  def read_csv_files
    puts "", "reading CSV files"

    csv_parse("topics") do |row|
      @db.insert_category(
        id: row[:id],
        name: row[:name],
        description: row[:description],
        position: row[:position],
        url: row[:htmlurl]
      )
    end

    csv_parse("users") do |row|
      @db.insert_user(
        id: row[:id],
        email: row[:email],
        name: row[:name],
        created_at: parse_datetime(row[:createdat]),
        last_seen_at: parse_datetime(row[:lastloginat]),
        active: true
      )
    end

    csv_parse("posts") do |row|
      @db.insert_topic(
        id: row[:id],
        title: row[:title],
        raw: row[:details],
        category_id: row[:topicid],
        closed: row[:closed] == "TRUE",
        user_id: row[:authorid],
        created_at: parse_datetime(row[:createdat]),
        url: row[:htmlurl]
      )
    end

    csv_parse("comments") do |row|
      @db.insert_post(
        id: row[:id],
        raw: row[:body],
        topic_id: row[:postid],
        user_id: row[:authorid],
        created_at: parse_datetime(row[:createdat]),
        url: row[:htmlurl]
      )
    end

    @db.execute_sql(<<~SQL)
      DELETE FROM user
      WHERE NOT EXISTS(
          SELECT 1
          FROM topic
          WHERE topic.user_id = user.id
      ) AND NOT EXISTS(
          SELECT 1
          FROM post
          WHERE post.user_id = user.id
      )
    SQL

    @db.sort_posts_by_created_at
  end

  def parse_datetime(text)
    return nil if text.blank? || text == "null"
    DateTime.parse(text)
  end

  def import_categories
    puts "", "creating categories"
    rows = @db.fetch_categories

    create_categories(rows) do |row|
      {
        id: row['id'],
        name: row['name'],
        description: row['description'],
        position: row['position'],
        post_create_action: proc do |category|
          url = remove_domain(row['url'])
          Permalink.create(url: url, category_id: category.id) unless permalink_exists?(url)
        end
      }
    end
  end

  def batches
    super(BATCH_SIZE)
  end

  def import_users
    puts "", "creating users"
    total_count = @db.count_users
    last_id = ''

    batches do |offset|
      rows, last_id = @db.fetch_users(last_id)
      break if rows.empty?

      next if all_records_exist?(:users, rows.map { |row| row['id'] })

      create_users(rows, total: total_count, offset: offset) do |row|
        {
          id: row['id'],
          email: row['email'],
          name: row['name'],
          created_at: row['created_at'],
          last_seen_at: row['last_seen_at'],
          active: row['active'] == 1
        }
      end
    end
  end

  def import_topics
    puts "", "creating topics"
    total_count = @db.count_topics
    last_id = ''

    batches do |offset|
      rows, last_id = @db.fetch_topics(last_id)
      break if rows.empty?

      next if all_records_exist?(:posts, rows.map { |row| import_topic_id(row['id']) })

      create_posts(rows, total: total_count, offset: offset) do |row|
        {
          id: import_topic_id(row['id']),
          title: row['title'].present? ? row['title'].strip[0...255] : "Topic title missing",
          raw: normalize_raw(row['raw']),
          category: category_id_from_imported_category_id(row['category_id']),
          user_id: user_id_from_imported_user_id(row['user_id']) || Discourse.system_user.id,
          created_at: row['created_at'],
          closed: row['closed'] == 1,
          post_create_action: proc do |post|
            url = remove_domain(row['url'])
            Permalink.create(url: url, topic_id: post.topic.id) unless permalink_exists?(url)
          end
        }
      end
    end
  end

  def import_topic_id(topic_id)
    "T#{topic_id}"
  end

  def import_posts
    puts "", "creating posts"
    total_count = @db.count_posts
    last_row_id = 0

    batches do |offset|
      rows, last_row_id = @db.fetch_posts(last_row_id)
      break if rows.empty?

      next if all_records_exist?(:posts, rows.map { |row| row['id'] })

      create_posts(rows, total: total_count, offset: offset) do |row|
        topic = topic_lookup_from_imported_post_id(import_topic_id(row['topic_id']))

        if topic.nil?
          p "MISSING TOPIC #{row['topic_id']}"
          p row
          next
        end

        {
          id: row['id'],
          raw: normalize_raw(row['raw']),
          user_id: user_id_from_imported_user_id(row['user_id']) || Discourse.system_user.id,
          topic_id: topic[:topic_id],
          created_at: row['created_at'],
          post_create_action: proc do |post|
            url = remove_domain(row['url'])
            Permalink.create(url: url, post_id: post.id) unless permalink_exists?(url)
          end
        }
      end
    end
  end

  def normalize_raw(raw)
    raw = raw.gsub('\n', '')
    raw = ReverseMarkdown.convert(raw)
    raw
  end

  def remove_domain(url)
    url.sub(OLD_DOMAIN, "")
  end

  def permalink_exists?(url)
    Permalink.find_by(url: url)
  end

  def csv_parse(table_name)
    CSV.foreach(File.join(@path, "#{table_name}.csv"),
                headers: true,
                header_converters: :symbol,
                skip_blanks: true,
                encoding: 'bom|utf-8') { |row| yield row }
  end
end

unless ARGV[0] && Dir.exist?(ARGV[0])
  puts "", "Usage:", "", "bundle exec ruby script/import_scripts/zendesk.rb DIRNAME", ""
  exit 1
end

ImportScripts::Zendesk.new(ARGV[0]).perform
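
If you ever need to generate fences from a script (say, when tidying up a batch of old posts), something like the following might do. This is only a minimal sketch in Ruby, not anything built into Discourse: `fence` and `folded` are made-up helper names, and `folded` assumes the site supports the `[details]` tag for collapsible sections, which would also cover the original question about partially folding long blocks.

```ruby
# Hypothetical helpers (not part of Discourse): build Markdown for a code block.

# Wrap text in a code fence. The fence is made longer than any run of
# backticks inside the text, so the content cannot close it early.
def fence(text, lang = "text")
  longest_run = text.scan(/`+/).map(&:length).max || 0
  ticks = "`" * [3, longest_run + 1].max
  "#{ticks}#{lang}\n#{text.chomp}\n#{ticks}\n"
end

# Assumption: the site supports the [details] tag for collapsible sections.
# The blank lines keep the fence on its own lines inside the tag.
def folded(text, lang = "text", summary = "Full listing")
  "[details=\"#{summary}\"]\n\n#{fence(text, lang)}\n[/details]\n"
end

puts folded("puts 'hello'\n", "ruby", "Show code")
```

Pasting the printed text into a post should give a collapsed section with the highlighted code inside it, provided the assumption about `[details]` holds on your site.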

jpessin1 · September 17, 2018, 9:56pm

Awesome, (I’m just starting to see those) glad to learn more about code fencing.

We’re still learning here, that said, the edit seems moderator appropriate to me (readability).

pfaffman · September 17, 2018, 10:09pm

Does that mean “please don’t edit our users’ posts since they don’t know WTF you are and it’d be more appropriate for our moderators to do stuff like that?”

If so. . . Uh, yeah. Sorry. It won’t happen again.

jpessin1 · September 17, 2018, 11:39pm

Apologies @pfaffman, that was some poor word choice on my part. If I may try that again:

That looks great. I’m still learning here, but that seems like a good example of how moderators can help by editing for readability. Thanks!

pfaffman · September 18, 2018, 3:06pm

No worries! I wasn’t quite sure, and several times clients have been concerned that there was even an admin account that community members didn’t know about. I took no offense, just wanted to be sure that I’m giving you the help you need.

Edit: AND there was no way to undo the fact that I’d made the edit, hence my loud mea culpa.

aculich · September 18, 2018, 4:47pm

I hope to establish a learning community here on our ask.ci site and to enable our community members to edit each other’s posts, the same way as is possible on StackExchange.

- Users and moderators can edit posts for clarity and ease of reading; code fences contribute to that goal.
- Moderators should develop community guidelines that support people editing posts.
- Editors of posts should be willing to send a message to users (especially new users) that both directs them to the official policies and community guidelines and lets them know that an edit was made, with useful examples that help them learn how to write more clearly in their future posts.

So, for example, when I edit someone’s post, especially if they are new to the site, I also send them a message on the site like this:

Just wanted to let you know that I edited your post to add a markdown code fence around your shell examples. See here for a discussion of that on our site: https://ask.cyberinfrastructure.org/t/best-practices-for-really-long-code-blocks/490/6

Also, to add code fences in your future posts you can do so this way:

```shell
$ ls -la
```

which will create a syntax-colored block like this:

$ ls -la

Hope that helps!

pfaffman · September 18, 2018, 5:01pm

Moderators can edit other people’s posts, as can trust level 4 users. While TL0-3 happen automatically, TL4 can be assigned only manually.

You can configure categories so that the topic is always a wiki, but that doesn’t apply to the posts under that topic.

Also, you can prefix a line with 4 spaces (which can cause problems if someone pastes in text where they indicate a paragraph with a bunch of spaces!), so “space, space, space, space ls” becomes

ls

Also, you can indicate code-like stuff in a sentence with backticks, like this:

Also, you can indicate `code-like stuff` in a sentence backticks
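
For anyone who wants to apply those two tricks from a script, here is a rough Ruby sketch. The helper names `indent_block` and `inline_code` are invented for illustration and are not part of Discourse; the backtick handling follows my understanding of how Markdown code spans work (use a longer delimiter than any backtick run inside).

```ruby
# Hypothetical helpers, not a Discourse API.

# Turn text into an indented code block: four spaces before every line.
def indent_block(text)
  text.each_line.map { |line| "    #{line}" }.join
end

# Wrap a short snippet in backticks for use inside a sentence.
# If the snippet itself contains backticks, use a longer delimiter
# and pad with spaces so the inner backticks are kept as content.
def inline_code(snippet)
  longest_run = snippet.scan(/`+/).map(&:length).max || 0
  return "`#{snippet}`" if longest_run.zero?

  ticks = "`" * (longest_run + 1)
  "#{ticks} #{snippet} #{ticks}"
end

puts indent_block("ls\npwd\n")
puts inline_code("ls -la")
```

Note that an indented block generally needs a blank line before it, otherwise it is treated as a continuation of the preceding paragraph.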

aculich · September 19, 2018, 4:02pm

@pfaffman Thanks for the clarification on the trust levels! Over time we can discuss with @jma what sort of process we want to have for manually assigning TL4 for non-moderators as our community grows here.

pfaffman · September 19, 2018, 4:14pm

Here’s more stuff on moderation and user levels:

pfaffman · September 19, 2018, 10:08pm

Yo! Look what I just found!

There’s a site setting called `autohighlight all code`.

An admin can see it here: https://ask.cyberinfrastructure.org/admin/site_settings/category/all_results?filter=autoh

I turned it on.

I’m not sure how right it’ll be. Here’s a test. . . . well. I’m not impressed.

Wait. Here’s what it does: “Force apply code highlighting to all preformatted code blocks even when they didn’t explicitly specify the language”.

It doesn’t guess that it’s code, it just guesses what language it is. Good thing I didn’t charge you for my time on this one!

check_docker_version () {
  echo  -e "\n\n==================== DOCKER INFO ====================" | tee -a $LOG_FILE
  docker_path=`which docker.io || which docker`
  if [ -z $docker_path ]; then
    echo "Docker is not installed. Have you installed Discourse at all?"
    echo "Perhaps you're looking for ./discourse-setup ."
    echo "There is no point in continuing."
    exit
  else
    echo -n "DOCKER VERSION: " | tee -a $LOG_FILE
    docker --version | tee -a  $LOG_FILE
    echo -e "\nDOCKER PROCESSES" | tee -a $LOG_FILE
    sudo docker ps -a | tee -a $LOG_FILE
  fi
}
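
Since that setting only affects blocks with no explicit language, it can be handy to know which fences in a post are unlabeled. Below is a small Ruby sketch of one way to find them. The function `unlabeled_fences` is a made-up name, it only looks at fenced blocks (backticks or tildes) and ignores four-space-indented ones, and it is not how Discourse itself decides anything.

```ruby
# Hypothetical helper: return the line numbers of opening code fences
# that carry no language label.
def unlabeled_fences(markdown)
  open_marker = nil # fence string that opened the current block, or nil
  unlabeled = []

  markdown.each_line.with_index(1) do |line, number|
    match = line.chomp.match(/\A {0,3}(`{3,}|~{3,})(.*)\z/)
    next unless match

    marker = match[1]
    info = match[2].strip

    if open_marker.nil?
      # Opening fence: remember it and note whether a language was given.
      open_marker = marker
      unlabeled << number if info.empty?
    elsif marker[0] == open_marker[0] && marker.length >= open_marker.length && info.empty?
      # Closing fence: same character, at least as long, nothing after it.
      open_marker = nil
    end
  end

  unlabeled
end

post = "```\nls\n```\n\n```ruby\nputs 1\n```\n"
p unlabeled_fences(post)
```

Adding `text` (or the right language) to the fences it reports keeps the highlighter from guessing.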