Parsing values from a text file in C

Possible Duplicate:
Parsing text in C

Suppose I have written a text file in this format:

key1/value1
key2/value2
akey/withavalue
anotherkey/withanothervalue

I have a linked list, like:

struct Node
{
    char *key;
    char *value;
    struct Node *next;
};

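to hold the values. How do I read key1 and value1? I was thinking of reading the file line by line into a buffer and using strtok(buffer, '/'). Would that work? What other approaches could work, perhaps faster or less error-prone? If you can, please include a code sample!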

Best answer

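Since your problem is a very good candidate for reducing memory fragmentation, here is an implementation that uses some simple arcane magic to allocate both strings and the structure itself in a single block of memory.

When destroying a node, only a single free() call on the node itself is needed.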

struct Node *list = NULL, **nextp = &list;
char buffer[1024];

while (fgets(buffer, sizeof buffer, file) != NULL) {
    struct Node *node;

    node = malloc(sizeof(struct Node) + strlen(buffer) + 1);
    if (node == NULL)
        break;  /* out of memory: stop reading, keep the list built so far */
    node->key = strtok(strcpy((char*)(node+1), buffer), "/\r\n");
    node->value = strtok(NULL, "\r\n");
    node->next = NULL;
    *nextp = node;
    nextp = &node->next;
}
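
To illustrate the single free() per node, here is a minimal sketch of how the whole list might be destroyed. The name free_list and its return value (the number of nodes released) are assumptions made for this example, not part of the original answer. It also assumes every node was allocated in one block, as in the loop above.

```c
#include <stdlib.h>

struct Node
{
    char *key;
    char *value;
    struct Node *next;
};

/* free_list is a hypothetical helper name. It assumes each node came from
   one malloc call that also holds its key and value strings. */
size_t free_list(struct Node *list)
{
    size_t count = 0;

    while (list != NULL) {
        struct Node *next = list->next;  /* save the link before freeing */

        free(list);                      /* releases struct, key and value together */
        list = next;
        count++;
    }
    return count;
}
```

Calling free_list(list) once the list is no longer needed releases everything; key and value must not be freed separately.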

Explanation:

With 20 comments and one unexplained downvote, I think the code needs some explanation, especially regarding the tricks used:

  1. Building a linked list:

    struct Node *list = NULL, **nextp = &list;
    ...
    *nextp = node;
    nextp = &node->next;
    

    This is a trick to create a linked list iteratively in forward order without having to special-case the head of the list. It uses a pointer-to-pointer to the next node. First the nextp pointer points to the list head pointer; in the first iteration, the list head is set through this pointer-to-pointer and then nextp is moved to the next pointer of that node. Subsequent iterations fill the next pointer of the last node.

  2. Single allocation:

    node = malloc(sizeof(struct Node) + strlen(buffer) + 1);
    node->key = ... strcpy((char*)(node+1), buffer) ...
    

    We have to deal with three pointers: the node itself, the key string and the value string. This would usually require three separate allocations (malloc, calloc, strdup...), and consequently three separate releases (free). Instead, in this case, the sizes of the three elements are summed in sizeof(struct Node) + strlen(buffer) + 1 and passed to a single malloc call, which returns a single block of memory. The beginning of this block is assigned to node, the structure itself. The additional memory (strlen(buffer)+1 bytes) comes right after the node, and its address is obtained with pointer arithmetic as node+1. It is used to hold a copy of the entire line read from the file ("key/value\n").

    Since malloc is called only once for each node, there is exactly one allocation to release. You must not call free(node->key) or free(node->value): those pointers were not returned by malloc, so passing them to free is undefined behavior. A single free(node) deallocates the structure and both strings in one block.

  3. Line parsing:

    node->key = strtok(strcpy((char*)(node+1), buffer), "/\r\n");
    node->value = strtok(NULL, "\r\n");
    

    The first call to strtok returns a pointer to the beginning of the copied line (provided the line does not start with a delimiter, since strtok skips leading delimiters). It looks for a '/' (or an end-of-line marker) and breaks the string there with a NUL character. So "key/value\n" is broken into "key" and "value\n" with a NUL character in between, and a pointer to the first part is returned and stored in node->key. The second call to strtok works on the remaining "value\n", strips the end-of-line marker and returns a pointer to "value", which is stored in node->value. If a line has no '/' or nothing after it, the second call returns NULL, so node->value should be checked before use.
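
The three techniques above can be combined in a slightly more defensive sketch. The helper names make_node and read_list are hypothetical, and skipping malformed lines is a design choice made for this example, not something the original code does. It assumes no line is longer than the 1024-byte buffer; fgets would split a longer line into several reads.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct Node
{
    char *key;
    char *value;
    struct Node *next;
};

/* make_node is a hypothetical helper. It copies line into the same block
   as the node and splits the copy in place. Returns NULL if malloc fails
   or if the line has no key or no value. */
struct Node *make_node(const char *line)
{
    struct Node *node = malloc(sizeof(struct Node) + strlen(line) + 1);

    if (node == NULL)
        return NULL;
    node->key = strtok(strcpy((char *)(node + 1), line), "/\r\n");
    node->value = node->key ? strtok(NULL, "\r\n") : NULL;
    node->next = NULL;
    if (node->value == NULL) {  /* blank line, missing '/', or empty value */
        free(node);
        return NULL;
    }
    return node;
}

/* read_list is a hypothetical wrapper around the loop from the answer.
   Lines for which make_node returns NULL are skipped. */
struct Node *read_list(FILE *file)
{
    struct Node *list = NULL, **nextp = &list;
    char buffer[1024];

    while (fgets(buffer, sizeof buffer, file) != NULL) {
        struct Node *node = make_node(buffer);

        if (node == NULL)
            continue;
        *nextp = node;        /* sets the head first, later the previous node's next */
        nextp = &node->next;  /* the following node will be linked through this */
    }
    return list;
}
```

Note that strtok keeps its position in hidden static state, so this sketch is not safe to call from several threads at once.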

I hope this clears up any questions about the solution above... which is overkill for a closed question. The complete test code is here.