P2395 题解

mydcwfy · 2021-11-18 20:53:16 · 题解

调了好久，好不容易才过了，给后来人留一些经验，也纪念一下吧。

1. 题意

将 BBCode 转化为 Markdown。

[h1]...[/h1] 转化为 # ... #，指一级标题
[h2]...[/h2] 转化为 ## ... ##，指二级标题（注意这两个都有空格）
[i]...[/i] 转化为 *...*，指斜体
[b]...[/b] 转化为 __...__，指粗体（注意这两个又没有空格）
[url=A]B[/url] 转化为 [B](A)，指链接
[img=A]B[/img] 转化为 ![B](A)，指图片（注意这两个的顺序）
[quote]...[/quote] 转化为 > ...，指代码块（注意换行，这个也是最难处理的）

如果不匹配输出 Match Error，如果最后没有都匹配，输出 Unclosed Mark。Match Error 优先。

[quote] 一定不会出现歧义。可能会出现嵌套。

[,],*,_,#,>,/, 不会在除 [quote] 包裹的代码块外出现，但 / 可能在 [url=A] 或 [img=A] 中出现。但 [url=A] 或 [img=A] 中的 A 不会出现嵌套。

2. 各种情况

相信大家看到这个题的时候，直接会想到用栈来维护嵌套，这里就不多讲了。

下面给出每一个转义的一些注意事项（毒瘤的题目，坑点比较多）：

1）一级标题 / 二级标题

这两个注意要输出空格，开始的时候要输出空格，结束的时候也要输出空格。

2）斜体 / 粗体

唯一没有坑点的转义了吧……

3）链接

首先，判断的时候，如果从 [ 开始的话，匹配到要将指针向后移动 5 位，输出一个 [，然后将后面的东西存入一个数组，然后碰到 ] 结束。碰到 [/url] 之后，输出 ]( 后，然后将存的东西输出。

4）图片

这个除了上面链接的坑点之外，还有一个：在判断他是哪一个转义的时候，注意判断转义后面的字符是不是 ] 或 =，否则的话，img 会被识别为斜体。

5）代码块

这个最特殊也最麻烦……

首先，注意判断 [quote] 前面有没有 \n，如果有的话，就不需要再输出一个换行了。

每一个碰到 \n 的时候，自动输出一个 \n>，但是最后的时候要判断是不是 [/quote]，最后一个如果还是 \n 的话不需要输出，因为代码块结束的时候自己会输出一个 \n。

结束的时候，如果本身有 \n，也应该跳过，不然的话就会 Too short on line x。

6）其他

由于最后可能会出现 Unclosed Mark，所以我们前面输出的时候应该将答案存下来，最后判断有没有收尾，有的话再将所有输出。

读入可以一直 getchar()，直到 \0（还是 EOF，不大记得了 qwq）

但这种写法调试的时候，必须使用文件输入输出，请注意。

存栈的时候，可以使用数字来标识是哪个转义。

关于数组大小，没有定论，反正我开了 5\times10^6 的长度，然后栈开了 1000，可过。

遇到结束的时候，可以直接跳过 /，再进行匹配。

前 4 个可以合并一下，使代码简短（虽然感觉我的代码最长

3. Code

毕竟是大模拟，有时带字符串的，很难调，故放上整个代码，供大家参考。

有没有考虑到的地方，欢迎在评论区指出。

码风略丑，请见谅。

#include <iostream>
#include <cstring>
#include <cstdio>
#include <algorithm>
using namespace std;

const char muban[8][7] = {{"\0"}, {"h1"}, {"h2"}, {"i"}, {"b"}, {"url"}, {"img"}, {"quote"}};
const char trans_to[6][7] = {{"\0"}, {"#"}, {"##"}, "*", "__"};
char ch, a[5000000], ans[5000000], k[10000][1000];
int stk[1000], top, n, tot;
//h1 -> 1 -> # || h2 -> 2 -> ## || i -> 3 -> * || b -> 4 -> __ || url -> 5 -> []() || img -> 6 -> ![]()
//quote -> 7 -> > 

bool cmp(int pos, int id)
{
    int sz = strlen(muban[id]);
    for (int i = 0; i < sz; ++ i)
        if (a[pos + i] != muban[id][i]) return false;
    if (a[pos + sz] != ']' && a[pos + sz] != '=') return false;
    return true;
}

int find_muban(int pos)
{
    for (int i = 1; i <= 7; ++ i)
        if (cmp(pos, i)) return i;
    return -1;
}

void materr()
{
    puts("Match Error");
    exit(0);
}

void uncldmk()
{
    printf("Unclosed ");
    puts("Mark");
    exit(0);
}

void output(string s)
{
    for (int i = 0; i < (int)s.size() && s[i]; ++ i)
        ans[++ tot] = s[i];
}

void Debug_Failed()
{
    puts("Failed");
    exit(0);
}

int main()
{
    // freopen("P2395_ex.in", "r", stdin);
    // freopen("P2395_ex.out", "w", stdout);
    while ((ch = getchar()) != -1 && ch != 0) a[++ n] = ch;

    for (int i = 1; i <= n;)
    {
        if (a[i] == '[')
        {
            bool flag = 0;
            if (a[i + 1] == '/') flag = 1, i ++;
            int t = find_muban(i + 1);
            if (t <= 0) Debug_Failed();
            // cout << "match : " << t << endl;
            if (flag)
            {
                if (t != stk[top]) materr();
                if (t <= 4)
                {
                    i += strlen(muban[t]) + 2;
                    if (t <= 2) output(" ");
                    output(trans_to[t]);
                }
                if (t == 5)
                {
                    output("](");
                    output(k[top] + 1);
                    output(")");
                    i += 5;
                }
                if (t == 6)
                {
                    output("](");
                    output(k[top] + 1);
                    output(")");
                    i += 5;
                }
                top --;
                continue;
            }
            if (t <= 4)
            {
                output(trans_to[t]);
                if (t <= 2) output(" ");
                stk[++ top] = t;
                i += strlen(muban[t]) + 2;
                continue;
            }
            if (t == 5)
            {
                i += 5;
                int tmp = 0;
                output("[");
                stk[++ top] = 5;
                while (a[i] != ']') k[top][++ tmp] = a[i ++];
                i ++;
            }
            if (t == 6)
            {
                i += 5;
                int tmp = 0;
                output("![");
                stk[++ top] = 6;
                while (a[i] != ']') k[top][++ tmp] = a[i ++];
                i ++;
            }
            if (t == 7)
            {
                i += 7;
                if (a[i] == '\n') i ++;
                if (i != 8 && a[i - 8] != '\n') output("\n");
                output("> ");
                while ((!cmp(i + 2, 7) || a[i + 1] != '/'))
                {
                    if (a[i] == '\n' && cmp(i + 3, 7) && a[i + 2] == '/')
                    {
                        i ++;
                        break;
                    }
                    if (a[i] == '\n') output("\n> ");
                    else ans[++ tot] = a[i];
                    i ++;
                }
                i += 8;
                output("\n");
                if (a[i] == '\n') i ++;
            }
        }
        else
        {
            string t = " ";
            t[0] = a[i ++];//将字符转化为字符串
            output(t);
        }
    }
    if (top) uncldmk();
    cout << ans + 1;
    return 0;
}