Parse a GET or POST string in C











up vote
3
down vote

favorite
1












This function takes a string from an HTTP POST or GET string, finds the specified variable, and allocates memory for the result and puts it in a string. The destination is given as an address of an empty pointer.



username=johndoe&password=password123


Would produce:



password123


when finding variable password.



void httpString(char **dest, char *input, const char *find) {
char *start;
char *o_input = input;
const char *o_find = find;
size_t length = 0;
size_t i = 0;
while (*input) {
if (*input == '&' || input == o_input) {
if (*input == '&') {
input++;
if (*input == 0) {
return;
}
}
while (*input == *find) {
if (*input == 0 || *find == 0) {
return;
}
input++;
find++;
if (*input == '=' && *find == 0) {
input++;
if (*input == 0) {
return;
}
start = input;
while (*input != '&' && *input) {
input++;
length++;
}
*dest = malloc(length + 1);
input = start;
while (*input != '&' && *input) {
(*dest)[i] = *input;
input++;
i++;
}
(*dest)[i] = 0;
return;
}
}
}
find = o_find;
input++;
}
}


Any feedback related to how this function can be improved would be greatly appreciated. I am worried about potential edge cases where a memory access violation could occur.










share|improve this question




















  • 1




    I'll try to do a full review tomorrow but for now I have to be pedantic : What you want to parse is called a Query String and for the sake of completness, it can contains more than simple value pairs. You can have multiples values for a key (eg, array of values) or keys without value (act like a flag).
    – Calak
    Nov 13 at 23:58










  • @Calak Thanks for letting me know. I do realize that query strings can be more complex than this, but I would likely keep inputs simple.
    – wispi
    Nov 14 at 3:55















up vote
3
down vote

favorite
1












This function takes a string from an HTTP POST or GET string, finds the specified variable, and allocates memory for the result and puts it in a string. The destination is given as an address of an empty pointer.



username=johndoe&password=password123


Would produce:



password123


when finding variable password.



void httpString(char **dest, char *input, const char *find) {
char *start;
char *o_input = input;
const char *o_find = find;
size_t length = 0;
size_t i = 0;
while (*input) {
if (*input == '&' || input == o_input) {
if (*input == '&') {
input++;
if (*input == 0) {
return;
}
}
while (*input == *find) {
if (*input == 0 || *find == 0) {
return;
}
input++;
find++;
if (*input == '=' && *find == 0) {
input++;
if (*input == 0) {
return;
}
start = input;
while (*input != '&' && *input) {
input++;
length++;
}
*dest = malloc(length + 1);
input = start;
while (*input != '&' && *input) {
(*dest)[i] = *input;
input++;
i++;
}
(*dest)[i] = 0;
return;
}
}
}
find = o_find;
input++;
}
}


Any feedback related to how this function can be improved would be greatly appreciated. I am worried about potential edge cases where a memory access violation could occur.










share|improve this question




















  • 1




    I'll try to do a full review tomorrow but for now I have to be pedantic : What you want to parse is called a Query String and for the sake of completness, it can contains more than simple value pairs. You can have multiples values for a key (eg, array of values) or keys without value (act like a flag).
    – Calak
    Nov 13 at 23:58










  • @Calak Thanks for letting me know. I do realize that query strings can be more complex than this, but I would likely keep inputs simple.
    – wispi
    Nov 14 at 3:55













up vote
3
down vote

favorite
1









up vote
3
down vote

favorite
1






1





This function takes a string from an HTTP POST or GET string, finds the specified variable, and allocates memory for the result and puts it in a string. The destination is given as an address of an empty pointer.



username=johndoe&password=password123


Would produce:



password123


when finding variable password.



void httpString(char **dest, char *input, const char *find) {
char *start;
char *o_input = input;
const char *o_find = find;
size_t length = 0;
size_t i = 0;
while (*input) {
if (*input == '&' || input == o_input) {
if (*input == '&') {
input++;
if (*input == 0) {
return;
}
}
while (*input == *find) {
if (*input == 0 || *find == 0) {
return;
}
input++;
find++;
if (*input == '=' && *find == 0) {
input++;
if (*input == 0) {
return;
}
start = input;
while (*input != '&' && *input) {
input++;
length++;
}
*dest = malloc(length + 1);
input = start;
while (*input != '&' && *input) {
(*dest)[i] = *input;
input++;
i++;
}
(*dest)[i] = 0;
return;
}
}
}
find = o_find;
input++;
}
}


Any feedback related to how this function can be improved would be greatly appreciated. I am worried about potential edge cases where a memory access violation could occur.










share|improve this question















This function takes a string from an HTTP POST or GET string, finds the specified variable, and allocates memory for the result and puts it in a string. The destination is given as an address of an empty pointer.



username=johndoe&password=password123


Would produce:



password123


when finding variable password.



void httpString(char **dest, char *input, const char *find) {
char *start;
char *o_input = input;
const char *o_find = find;
size_t length = 0;
size_t i = 0;
while (*input) {
if (*input == '&' || input == o_input) {
if (*input == '&') {
input++;
if (*input == 0) {
return;
}
}
while (*input == *find) {
if (*input == 0 || *find == 0) {
return;
}
input++;
find++;
if (*input == '=' && *find == 0) {
input++;
if (*input == 0) {
return;
}
start = input;
while (*input != '&' && *input) {
input++;
length++;
}
*dest = malloc(length + 1);
input = start;
while (*input != '&' && *input) {
(*dest)[i] = *input;
input++;
i++;
}
(*dest)[i] = 0;
return;
}
}
}
find = o_find;
input++;
}
}


Any feedback related to how this function can be improved would be greatly appreciated. I am worried about potential edge cases where a memory access violation could occur.







c parsing http url






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 14 at 0:26









200_success

127k15148410




127k15148410










asked Nov 13 at 22:15









wispi

404




404








  • 1




    I'll try to do a full review tomorrow but for now I have to be pedantic : What you want to parse is called a Query String and for the sake of completness, it can contains more than simple value pairs. You can have multiples values for a key (eg, array of values) or keys without value (act like a flag).
    – Calak
    Nov 13 at 23:58










  • @Calak Thanks for letting me know. I do realize that query strings can be more complex than this, but I would likely keep inputs simple.
    – wispi
    Nov 14 at 3:55














  • 1




    I'll try to do a full review tomorrow but for now I have to be pedantic : What you want to parse is called a Query String and for the sake of completness, it can contains more than simple value pairs. You can have multiples values for a key (eg, array of values) or keys without value (act like a flag).
    – Calak
    Nov 13 at 23:58










  • @Calak Thanks for letting me know. I do realize that query strings can be more complex than this, but I would likely keep inputs simple.
    – wispi
    Nov 14 at 3:55








1




1




I'll try to do a full review tomorrow but for now I have to be pedantic : What you want to parse is called a Query String and for the sake of completness, it can contains more than simple value pairs. You can have multiples values for a key (eg, array of values) or keys without value (act like a flag).
– Calak
Nov 13 at 23:58




I'll try to do a full review tomorrow but for now I have to be pedantic : What you want to parse is called a Query String and for the sake of completness, it can contains more than simple value pairs. You can have multiples values for a key (eg, array of values) or keys without value (act like a flag).
– Calak
Nov 13 at 23:58












@Calak Thanks for letting me know. I do realize that query strings can be more complex than this, but I would likely keep inputs simple.
– wispi
Nov 14 at 3:55




@Calak Thanks for letting me know. I do realize that query strings can be more complex than this, but I would likely keep inputs simple.
– wispi
Nov 14 at 3:55










3 Answers
3






active

oldest

votes

















up vote
2
down vote













The easiest way to improve this function is to use the C standard library. As it is, it's difficult to read; you can't understand what it does at a glance. And it's unnecessary difficult, because most of the building blocks are already available for free in <string.h>:




  • find a string in another string with strstr;

  • find a character in a string with strchr;

  • copy a string with strcpy or a substring with memcpy


Once you have simplified your code, you'll have more brain space to care for not so negligeable things such as testing memory allocations:



void httpString(char **dest, char *input, const char *find) {
char* found = strstr(input, find);
if (!found) {
printf("find not found!");
return;
}
char* assign = found + strlen(find);
if (*assign != '=') {
printf("ill-formed!");
return;
}
char* value = assign + 1;
char* end_value = strchr(value, '&');
if (!end_value) end_value = strchr(value, 0);
int length = end_value - value;

*dest = (char*) malloc(length + 1);
if (!*dest) {
printf("Not enough memory");
return;
}
memcpy(*dest, value, length);
(*dest)[length] = 0;
}





share|improve this answer



















  • 2




    In the last line, you could write (*dest)[length + 1] = '', which would make the intention a bit clearer.
    – Roland Illig
    Nov 14 at 1:02










  • @chux, RolandIllig : thanks, I've edited my answer
    – papagaga
    Nov 15 at 15:16










  • This code skips leading & detection and so will incorrectly find nope in username=johndoe&notpassword=nope&password=password123. It also incorrectly exits on username=johndoe&passwordnot=nope&password=password123
    – chux
    Nov 15 at 16:26


















up vote
1
down vote













In both GET query strings and POST bodies, the key-value pairs are percent-encoded. Therefore, username and %75%73%65%72name would be considered semantically equivalent keys, and your parser should also look for the percent-encoded variants in the input. Conversely, the function should automatically percent-decode any value that it finds, both for symmetry and utility.



Why not return the result instead of returning void?



However, I'd prefer a design that avoids malloc() altogether, because malloc() could fail, and your caller could easily forget to free() the allocated memory. Consider writing a parser that helps you iterate over the keys and values, overwriting the input with the decoded results. It's kind of ugly, but avoids malloc() altogether.



#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/**
* Percent-decodes a string in-place.
*/
static void percentDecode(char *s) {
/* TODO */
}

/**
* Returns a pointer to the beginning of the a key-value pair, writing
* a NUL delimiter to the input. Advances input to the next key-value pair.
*/
char *keyValuePair(char **input) {
return strsep(input, "&");
}

/**
* Splits keyValue into two strings, and performs percent-decoding on both.
* Returns a pointer to the key, and advances keyValue to point to the value.
*/
char *extractKey(char **keyValue) {
char *key = strsep(keyValue, "=");
percentDecode(key);
percentDecode(*keyValue);
return key;
}

int main() {
char *input = strdup("username=johndoe&password=password123");
for (char *key; (key = keyValuePair(&input)); ) {
char *value = key;
if (0 == strcmp("password", extractKey(&value))) {
printf("Found %s: %sn", key, value);
}
}
free(input);
}





share|improve this answer




























    up vote
    1
    down vote













    Passwords and library functions



    Code dealing with passwords needs to be careful about calling library functions that are not secure as those functions might leave copies of data lingering who-knows-where or leak timing information. That is a good reason to not call standard functions. Still, for developing code, better to first use standard functions and then later replace with secure code.



    Flaw: Ambiguous allocation



    httpString(char **dest, ) along some paths will allocate memory for *dest, but not all. Function lacks any notification to the caller if allocation occurred or not. This is one of those "I am worried about potential edge cases".



    const



    As char *input data does not change, add const for greater applicability and potential optimizations.



    //void httpString(char **dest, char *input, const char *find) {
    // char *start;
    // char *o_input = input;

    void httpString(char **dest, const char *input, const char *find) {
    const char *start;
    const char *o_input = input;




    Minor



    No allocation check



    *dest = malloc(length + 1);
    if (*dest == NULL) {
    // do something


    Missing proto



    Add #include <stdlib.h> for malloc().



    Unneeded code



    while (*input == *find) {
    // if (*input == 0 || *find == 0) {
    if (*input == 0) {
    return;
    }


    Naming



    char *input is not that useful. Yes it is input, but input about what?



    For such searching goals, code could use boring s1, s2 like C lib strstr(const char *s1, const char *s2), yet I prefer something more illustrative.



    void httpString(char **dest, const char *src, const char *pattern)


    ... or more fun: Needle in a haystack



    void httpString(char **dest, const char *haystack, const char *needle)




    Candidate alternative:



    char *httpString(const char * restrict haystack, const char * restrict needle) {
    size_t needle_len = strlen(needle);
    while (*haystack) {
    if (*haystack++ == '&' && strncmp(haystack, needle, needle_len) == 0
    && haystack[needle_len] == '=') {
    haystack += needle_len + 1;
    size_t password_len = strcspn(haystack, "&");
    char *pw = malloc(password_len + 1u);
    if (pw == NULL) {
    return NULL; // Out of memory
    }
    pw[password_len] = '';
    return memcpy(pw, haystack, password_len);
    }
    }
    return NULL; // Not found
    }





    share|improve this answer























      Your Answer





      StackExchange.ifUsing("editor", function () {
      return StackExchange.using("mathjaxEditing", function () {
      StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
      StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
      });
      });
      }, "mathjax-editing");

      StackExchange.ifUsing("editor", function () {
      StackExchange.using("externalEditor", function () {
      StackExchange.using("snippets", function () {
      StackExchange.snippets.init();
      });
      });
      }, "code-snippets");

      StackExchange.ready(function() {
      var channelOptions = {
      tags: "".split(" "),
      id: "196"
      };
      initTagRenderer("".split(" "), "".split(" "), channelOptions);

      StackExchange.using("externalEditor", function() {
      // Have to fire editor after snippets, if snippets enabled
      if (StackExchange.settings.snippets.snippetsEnabled) {
      StackExchange.using("snippets", function() {
      createEditor();
      });
      }
      else {
      createEditor();
      }
      });

      function createEditor() {
      StackExchange.prepareEditor({
      heartbeatType: 'answer',
      convertImagesToLinks: false,
      noModals: true,
      showLowRepImageUploadWarning: true,
      reputationToPostImages: null,
      bindNavPrevention: true,
      postfix: "",
      imageUploader: {
      brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
      contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
      allowUrls: true
      },
      onDemand: true,
      discardSelector: ".discard-answer"
      ,immediatelyShowMarkdownHelp:true
      });


      }
      });














       

      draft saved


      draft discarded


















      StackExchange.ready(
      function () {
      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f207598%2fparse-a-get-or-post-string-in-c%23new-answer', 'question_page');
      }
      );

      Post as a guest















      Required, but never shown

























      3 Answers
      3






      active

      oldest

      votes








      3 Answers
      3






      active

      oldest

      votes









      active

      oldest

      votes






      active

      oldest

      votes








      up vote
      2
      down vote













      The easiest way to improve this function is to use the C standard library. As it is, it's difficult to read; you can't understand what it does at a glance. And it's unnecessary difficult, because most of the building blocks are already available for free in <string.h>:




      • find a string in another string with strstr;

      • find a character in a string with strchr;

      • copy a string with strcpy or a substring with memcpy


      Once you have simplified your code, you'll have more brain space to care for not so negligeable things such as testing memory allocations:



      void httpString(char **dest, char *input, const char *find) {
      char* found = strstr(input, find);
      if (!found) {
      printf("find not found!");
      return;
      }
      char* assign = found + strlen(find);
      if (*assign != '=') {
      printf("ill-formed!");
      return;
      }
      char* value = assign + 1;
      char* end_value = strchr(value, '&');
      if (!end_value) end_value = strchr(value, 0);
      int length = end_value - value;

      *dest = (char*) malloc(length + 1);
      if (!*dest) {
      printf("Not enough memory");
      return;
      }
      memcpy(*dest, value, length);
      (*dest)[length] = 0;
      }





      share|improve this answer



















      • 2




        In the last line, you could write (*dest)[length + 1] = '', which would make the intention a bit clearer.
        – Roland Illig
        Nov 14 at 1:02










      • @chux, RolandIllig : thanks, I've edited my answer
        – papagaga
        Nov 15 at 15:16










      • This code skips leading & detection and so will incorrectly find nope in username=johndoe&notpassword=nope&password=password123. It also incorrectly exits on username=johndoe&passwordnot=nope&password=password123
        – chux
        Nov 15 at 16:26















      up vote
      2
      down vote













      The easiest way to improve this function is to use the C standard library. As it is, it's difficult to read; you can't understand what it does at a glance. And it's unnecessary difficult, because most of the building blocks are already available for free in <string.h>:




      • find a string in another string with strstr;

      • find a character in a string with strchr;

      • copy a string with strcpy or a substring with memcpy


      Once you have simplified your code, you'll have more brain space to care for not so negligeable things such as testing memory allocations:



      void httpString(char **dest, char *input, const char *find) {
      char* found = strstr(input, find);
      if (!found) {
      printf("find not found!");
      return;
      }
      char* assign = found + strlen(find);
      if (*assign != '=') {
      printf("ill-formed!");
      return;
      }
      char* value = assign + 1;
      char* end_value = strchr(value, '&');
      if (!end_value) end_value = strchr(value, 0);
      int length = end_value - value;

      *dest = (char*) malloc(length + 1);
      if (!*dest) {
      printf("Not enough memory");
      return;
      }
      memcpy(*dest, value, length);
      (*dest)[length] = 0;
      }





      share|improve this answer



















      • 2




        In the last line, you could write (*dest)[length + 1] = '', which would make the intention a bit clearer.
        – Roland Illig
        Nov 14 at 1:02










      • @chux, RolandIllig : thanks, I've edited my answer
        – papagaga
        Nov 15 at 15:16










      • This code skips leading & detection and so will incorrectly find nope in username=johndoe&notpassword=nope&password=password123. It also incorrectly exits on username=johndoe&passwordnot=nope&password=password123
        – chux
        Nov 15 at 16:26













      up vote
      2
      down vote










      up vote
      2
      down vote









      The easiest way to improve this function is to use the C standard library. As it is, it's difficult to read; you can't understand what it does at a glance. And it's unnecessary difficult, because most of the building blocks are already available for free in <string.h>:




      • find a string in another string with strstr;

      • find a character in a string with strchr;

      • copy a string with strcpy or a substring with memcpy


      Once you have simplified your code, you'll have more brain space to care for not so negligeable things such as testing memory allocations:



      void httpString(char **dest, char *input, const char *find) {
      char* found = strstr(input, find);
      if (!found) {
      printf("find not found!");
      return;
      }
      char* assign = found + strlen(find);
      if (*assign != '=') {
      printf("ill-formed!");
      return;
      }
      char* value = assign + 1;
      char* end_value = strchr(value, '&');
      if (!end_value) end_value = strchr(value, 0);
      int length = end_value - value;

      *dest = (char*) malloc(length + 1);
      if (!*dest) {
      printf("Not enough memory");
      return;
      }
      memcpy(*dest, value, length);
      (*dest)[length] = 0;
      }





      share|improve this answer














      The easiest way to improve this function is to use the C standard library. As it is, it's difficult to read; you can't understand what it does at a glance. And it's unnecessary difficult, because most of the building blocks are already available for free in <string.h>:




      • find a string in another string with strstr;

      • find a character in a string with strchr;

      • copy a string with strcpy or a substring with memcpy


      Once you have simplified your code, you'll have more brain space to care for not so negligeable things such as testing memory allocations:



      void httpString(char **dest, char *input, const char *find) {
      char* found = strstr(input, find);
      if (!found) {
      printf("find not found!");
      return;
      }
      char* assign = found + strlen(find);
      if (*assign != '=') {
      printf("ill-formed!");
      return;
      }
      char* value = assign + 1;
      char* end_value = strchr(value, '&');
      if (!end_value) end_value = strchr(value, 0);
      int length = end_value - value;

      *dest = (char*) malloc(length + 1);
      if (!*dest) {
      printf("Not enough memory");
      return;
      }
      memcpy(*dest, value, length);
      (*dest)[length] = 0;
      }






      share|improve this answer














      share|improve this answer



      share|improve this answer








      edited Nov 15 at 15:15

























      answered Nov 13 at 23:37









      papagaga

      3,774221




      3,774221








      • 2




        In the last line, you could write (*dest)[length + 1] = '', which would make the intention a bit clearer.
        – Roland Illig
        Nov 14 at 1:02










      • @chux, RolandIllig : thanks, I've edited my answer
        – papagaga
        Nov 15 at 15:16










      • This code skips leading & detection and so will incorrectly find nope in username=johndoe&notpassword=nope&password=password123. It also incorrectly exits on username=johndoe&passwordnot=nope&password=password123
        – chux
        Nov 15 at 16:26














      • 2




        In the last line, you could write (*dest)[length + 1] = '', which would make the intention a bit clearer.
        – Roland Illig
        Nov 14 at 1:02










      • @chux, RolandIllig : thanks, I've edited my answer
        – papagaga
        Nov 15 at 15:16










      • This code skips leading & detection and so will incorrectly find nope in username=johndoe&notpassword=nope&password=password123. It also incorrectly exits on username=johndoe&passwordnot=nope&password=password123
        – chux
        Nov 15 at 16:26








      2




      2




      In the last line, you could write (*dest)[length + 1] = '', which would make the intention a bit clearer.
      – Roland Illig
      Nov 14 at 1:02




      In the last line, you could write (*dest)[length + 1] = '', which would make the intention a bit clearer.
      – Roland Illig
      Nov 14 at 1:02












      @chux, RolandIllig : thanks, I've edited my answer
      – papagaga
      Nov 15 at 15:16




      @chux, RolandIllig : thanks, I've edited my answer
      – papagaga
      Nov 15 at 15:16












      This code skips leading & detection and so will incorrectly find nope in username=johndoe&notpassword=nope&password=password123. It also incorrectly exits on username=johndoe&passwordnot=nope&password=password123
      – chux
      Nov 15 at 16:26




      This code skips leading & detection and so will incorrectly find nope in username=johndoe&notpassword=nope&password=password123. It also incorrectly exits on username=johndoe&passwordnot=nope&password=password123
      – chux
      Nov 15 at 16:26












      up vote
      1
      down vote













      In both GET query strings and POST bodies, the key-value pairs are percent-encoded. Therefore, username and %75%73%65%72name would be considered semantically equivalent keys, and your parser should also look for the percent-encoded variants in the input. Conversely, the function should automatically percent-decode any value that it finds, both for symmetry and utility.



      Why not return the result instead of returning void?



      However, I'd prefer a design that avoids malloc() altogether, because malloc() could fail, and your caller could easily forget to free() the allocated memory. Consider writing a parser that helps you iterate over the keys and values, overwriting the input with the decoded results. It's kind of ugly, but avoids malloc() altogether.



      #include <stdio.h>
      #include <stdlib.h>
      #include <string.h>

      /**
      * Percent-decodes a string in-place.
      */
      static void percentDecode(char *s) {
      /* TODO */
      }

      /**
      * Returns a pointer to the beginning of the a key-value pair, writing
      * a NUL delimiter to the input. Advances input to the next key-value pair.
      */
      char *keyValuePair(char **input) {
      return strsep(input, "&");
      }

      /**
      * Splits keyValue into two strings, and performs percent-decoding on both.
      * Returns a pointer to the key, and advances keyValue to point to the value.
      */
      char *extractKey(char **keyValue) {
      char *key = strsep(keyValue, "=");
      percentDecode(key);
      percentDecode(*keyValue);
      return key;
      }

      int main() {
      char *input = strdup("username=johndoe&password=password123");
      for (char *key; (key = keyValuePair(&input)); ) {
      char *value = key;
      if (0 == strcmp("password", extractKey(&value))) {
      printf("Found %s: %sn", key, value);
      }
      }
      free(input);
      }





      share|improve this answer

























        up vote
        1
        down vote













        In both GET query strings and POST bodies, the key-value pairs are percent-encoded. Therefore, username and %75%73%65%72name would be considered semantically equivalent keys, and your parser should also look for the percent-encoded variants in the input. Conversely, the function should automatically percent-decode any value that it finds, both for symmetry and utility.



        Why not return the result instead of returning void?



        However, I'd prefer a design that avoids malloc() altogether, because malloc() could fail, and your caller could easily forget to free() the allocated memory. Consider writing a parser that helps you iterate over the keys and values, overwriting the input with the decoded results. It's kind of ugly, but avoids malloc() altogether.



        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>

        /**
        * Percent-decodes a string in-place.
        */
        static void percentDecode(char *s) {
        /* TODO */
        }

        /**
        * Returns a pointer to the beginning of the a key-value pair, writing
        * a NUL delimiter to the input. Advances input to the next key-value pair.
        */
        char *keyValuePair(char **input) {
        return strsep(input, "&");
        }

        /**
        * Splits keyValue into two strings, and performs percent-decoding on both.
        * Returns a pointer to the key, and advances keyValue to point to the value.
        */
        char *extractKey(char **keyValue) {
        char *key = strsep(keyValue, "=");
        percentDecode(key);
        percentDecode(*keyValue);
        return key;
        }

        int main() {
        char *input = strdup("username=johndoe&password=password123");
        for (char *key; (key = keyValuePair(&input)); ) {
        char *value = key;
        if (0 == strcmp("password", extractKey(&value))) {
        printf("Found %s: %sn", key, value);
        }
        }
        free(input);
        }





        share|improve this answer























          up vote
          1
          down vote










          up vote
          1
          down vote









          In both GET query strings and POST bodies, the key-value pairs are percent-encoded. Therefore, username and %75%73%65%72name would be considered semantically equivalent keys, and your parser should also look for the percent-encoded variants in the input. Conversely, the function should automatically percent-decode any value that it finds, both for symmetry and utility.



          Why not return the result instead of returning void?



          However, I'd prefer a design that avoids malloc() altogether, because malloc() could fail, and your caller could easily forget to free() the allocated memory. Consider writing a parser that helps you iterate over the keys and values, overwriting the input with the decoded results. It's kind of ugly, but avoids malloc() altogether.



          #include <stdio.h>
          #include <stdlib.h>
          #include <string.h>

          /**
          * Percent-decodes a string in-place.
          */
          static void percentDecode(char *s) {
          /* TODO */
          }

          /**
          * Returns a pointer to the beginning of the a key-value pair, writing
          * a NUL delimiter to the input. Advances input to the next key-value pair.
          */
          char *keyValuePair(char **input) {
          return strsep(input, "&");
          }

          /**
          * Splits keyValue into two strings, and performs percent-decoding on both.
          * Returns a pointer to the key, and advances keyValue to point to the value.
          */
          char *extractKey(char **keyValue) {
          char *key = strsep(keyValue, "=");
          percentDecode(key);
          percentDecode(*keyValue);
          return key;
          }

          int main() {
          char *input = strdup("username=johndoe&password=password123");
          for (char *key; (key = keyValuePair(&input)); ) {
          char *value = key;
          if (0 == strcmp("password", extractKey(&value))) {
          printf("Found %s: %sn", key, value);
          }
          }
          free(input);
          }





          share|improve this answer












          In both GET query strings and POST bodies, the key-value pairs are percent-encoded. Therefore, username and %75%73%65%72name would be considered semantically equivalent keys, and your parser should also look for the percent-encoded variants in the input. Conversely, the function should automatically percent-decode any value that it finds, both for symmetry and utility.



          Why not return the result instead of returning void?



          However, I'd prefer a design that avoids malloc() altogether, because malloc() could fail, and your caller could easily forget to free() the allocated memory. Consider writing a parser that helps you iterate over the keys and values, overwriting the input with the decoded results. It's kind of ugly, but avoids malloc() altogether.



          #include <stdio.h>
          #include <stdlib.h>
          #include <string.h>

          /**
          * Percent-decodes a string in-place.
          */
          static void percentDecode(char *s) {
          /* TODO */
          }

          /**
          * Returns a pointer to the beginning of the a key-value pair, writing
          * a NUL delimiter to the input. Advances input to the next key-value pair.
          */
          char *keyValuePair(char **input) {
          return strsep(input, "&");
          }

          /**
          * Splits keyValue into two strings, and performs percent-decoding on both.
          * Returns a pointer to the key, and advances keyValue to point to the value.
          */
          char *extractKey(char **keyValue) {
          char *key = strsep(keyValue, "=");
          percentDecode(key);
          percentDecode(*keyValue);
          return key;
          }

          int main() {
          char *input = strdup("username=johndoe&password=password123");
          for (char *key; (key = keyValuePair(&input)); ) {
          char *value = key;
          if (0 == strcmp("password", extractKey(&value))) {
          printf("Found %s: %sn", key, value);
          }
          }
          free(input);
          }






          share|improve this answer












          share|improve this answer



          share|improve this answer










          answered Nov 14 at 1:32









          200_success

          127k15148410




          127k15148410






















              up vote
              1
              down vote













              Passwords and library functions



              Code dealing with passwords needs to be careful about calling library functions that are not secure as those functions might leave copies of data lingering who-knows-where or leak timing information. That is a good reason to not call standard functions. Still, for developing code, better to first use standard functions and then later replace with secure code.



              Flaw: Ambiguous allocation



              httpString(char **dest, ) along some paths will allocate memory for *dest, but not all. Function lacks any notification to the caller if allocation occurred or not. This is one of those "I am worried about potential edge cases".



              const



              As char *input data does not change, add const for greater applicability and potential optimizations.



              //void httpString(char **dest, char *input, const char *find) {
              // char *start;
              // char *o_input = input;

              void httpString(char **dest, const char *input, const char *find) {
              const char *start;
              const char *o_input = input;




              Minor



              No allocation check



              *dest = malloc(length + 1);
              if (*dest == NULL) {
              // do something


              Missing proto



              Add #include <stdlib.h> for malloc().



              Unneeded code



              while (*input == *find) {
              // if (*input == 0 || *find == 0) {
              if (*input == 0) {
              return;
              }


              Naming



              char *input is not that useful. Yes it is input, but input about what?



              For such searching goals, code could use boring s1, s2 like C lib strstr(const char *s1, const char *s2), yet I prefer something more illustrative.



              void httpString(char **dest, const char *src, const char *pattern)


              ... or more fun: Needle in a haystack



              void httpString(char **dest, const char *haystack, const char *needle)




              Candidate alternative:



              char *httpString(const char * restrict haystack, const char * restrict needle) {
              size_t needle_len = strlen(needle);
              while (*haystack) {
              if (*haystack++ == '&' && strncmp(haystack, needle, needle_len) == 0
              && haystack[needle_len] == '=') {
              haystack += needle_len + 1;
              size_t password_len = strcspn(haystack, "&");
              char *pw = malloc(password_len + 1u);
              if (pw == NULL) {
              return NULL; // Out of memory
              }
              pw[password_len] = '';
              return memcpy(pw, haystack, password_len);
              }
              }
              return NULL; // Not found
              }





              share|improve this answer



























                up vote
                1
                down vote













                Passwords and library functions



                Code dealing with passwords needs to be careful about calling library functions that are not secure as those functions might leave copies of data lingering who-knows-where or leak timing information. That is a good reason to not call standard functions. Still, for developing code, better to first use standard functions and then later replace with secure code.



                Flaw: Ambiguous allocation



                httpString(char **dest, ) along some paths will allocate memory for *dest, but not all. Function lacks any notification to the caller if allocation occurred or not. This is one of those "I am worried about potential edge cases".



                const



                As char *input data does not change, add const for greater applicability and potential optimizations.



                //void httpString(char **dest, char *input, const char *find) {
                // char *start;
                // char *o_input = input;

                void httpString(char **dest, const char *input, const char *find) {
                const char *start;
                const char *o_input = input;




                Minor



                No allocation check



                *dest = malloc(length + 1);
                if (*dest == NULL) {
                // do something


                Missing proto



                Add #include <stdlib.h> for malloc().



                Unneeded code



                while (*input == *find) {
                // if (*input == 0 || *find == 0) {
                if (*input == 0) {
                return;
                }


                Naming



                char *input is not that useful. Yes it is input, but input about what?



                For such searching goals, code could use boring s1, s2 like C lib strstr(const char *s1, const char *s2), yet I prefer something more illustrative.



                void httpString(char **dest, const char *src, const char *pattern)


                ... or more fun: Needle in a haystack



                void httpString(char **dest, const char *haystack, const char *needle)




                Candidate alternative:



                char *httpString(const char * restrict haystack, const char * restrict needle) {
                size_t needle_len = strlen(needle);
                while (*haystack) {
                if (*haystack++ == '&' && strncmp(haystack, needle, needle_len) == 0
                && haystack[needle_len] == '=') {
                haystack += needle_len + 1;
                size_t password_len = strcspn(haystack, "&");
                char *pw = malloc(password_len + 1u);
                if (pw == NULL) {
                return NULL; // Out of memory
                }
                pw[password_len] = '';
                return memcpy(pw, haystack, password_len);
                }
                }
                return NULL; // Not found
                }





                share|improve this answer

























                  up vote
                  1
                  down vote










                  up vote
                  1
                  down vote









                  Passwords and library functions



                  Code dealing with passwords needs to be careful about calling library functions that are not secure as those functions might leave copies of data lingering who-knows-where or leak timing information. That is a good reason to not call standard functions. Still, for developing code, better to first use standard functions and then later replace with secure code.



                  Flaw: Ambiguous allocation



                  httpString(char **dest, ) along some paths will allocate memory for *dest, but not all. Function lacks any notification to the caller if allocation occurred or not. This is one of those "I am worried about potential edge cases".



                  const



                  As char *input data does not change, add const for greater applicability and potential optimizations.



                  //void httpString(char **dest, char *input, const char *find) {
                  // char *start;
                  // char *o_input = input;

                  void httpString(char **dest, const char *input, const char *find) {
                  const char *start;
                  const char *o_input = input;




                  Minor



                  No allocation check



                  *dest = malloc(length + 1);
                  if (*dest == NULL) {
                  // do something


                  Missing proto



                  Add #include <stdlib.h> for malloc().



                  Unneeded code



                  while (*input == *find) {
                  // if (*input == 0 || *find == 0) {
                  if (*input == 0) {
                  return;
                  }


                  Naming



                  char *input is not that useful. Yes it is input, but input about what?



                  For such searching goals, code could use boring s1, s2 like C lib strstr(const char *s1, const char *s2), yet I prefer something more illustrative.



                  void httpString(char **dest, const char *src, const char *pattern)


                  ... or more fun: Needle in a haystack



                  void httpString(char **dest, const char *haystack, const char *needle)




                  Candidate alternative:



                  char *httpString(const char * restrict haystack, const char * restrict needle) {
                  size_t needle_len = strlen(needle);
                  while (*haystack) {
                  if (*haystack++ == '&' && strncmp(haystack, needle, needle_len) == 0
                  && haystack[needle_len] == '=') {
                  haystack += needle_len + 1;
                  size_t password_len = strcspn(haystack, "&");
                  char *pw = malloc(password_len + 1u);
                  if (pw == NULL) {
                  return NULL; // Out of memory
                  }
                  pw[password_len] = '';
                  return memcpy(pw, haystack, password_len);
                  }
                  }
                  return NULL; // Not found
                  }





                  share|improve this answer














                  Passwords and library functions



                  Code dealing with passwords needs to be careful about calling library functions that are not secure as those functions might leave copies of data lingering who-knows-where or leak timing information. That is a good reason to not call standard functions. Still, for developing code, better to first use standard functions and then later replace with secure code.



                  Flaw: Ambiguous allocation



                  httpString(char **dest, ) along some paths will allocate memory for *dest, but not all. Function lacks any notification to the caller if allocation occurred or not. This is one of those "I am worried about potential edge cases".



                  const



                  As char *input data does not change, add const for greater applicability and potential optimizations.



                  //void httpString(char **dest, char *input, const char *find) {
                  // char *start;
                  // char *o_input = input;

                  void httpString(char **dest, const char *input, const char *find) {
                  const char *start;
                  const char *o_input = input;




                  Minor



                  No allocation check



                  *dest = malloc(length + 1);
                  if (*dest == NULL) {
                  // do something


                  Missing proto



                  Add #include <stdlib.h> for malloc().



                  Unneeded code



                  while (*input == *find) {
                  // if (*input == 0 || *find == 0) {
                  if (*input == 0) {
                  return;
                  }


                  Naming



                  char *input is not that useful. Yes it is input, but input about what?



                  For such searching goals, code could use boring s1, s2 like C lib strstr(const char *s1, const char *s2), yet I prefer something more illustrative.



                  void httpString(char **dest, const char *src, const char *pattern)


                  ... or more fun: Needle in a haystack



                  void httpString(char **dest, const char *haystack, const char *needle)




                  Candidate alternative:



                  char *httpString(const char * restrict haystack, const char * restrict needle) {
                  size_t needle_len = strlen(needle);
                  while (*haystack) {
                  if (*haystack++ == '&' && strncmp(haystack, needle, needle_len) == 0
                  && haystack[needle_len] == '=') {
                  haystack += needle_len + 1;
                  size_t password_len = strcspn(haystack, "&");
                  char *pw = malloc(password_len + 1u);
                  if (pw == NULL) {
                  return NULL; // Out of memory
                  }
                  pw[password_len] = '';
                  return memcpy(pw, haystack, password_len);
                  }
                  }
                  return NULL; // Not found
                  }






                  share|improve this answer














                  share|improve this answer



                  share|improve this answer








                  edited Nov 15 at 16:19

























                  answered Nov 15 at 15:34









                  chux

                  12.2k11342




                  12.2k11342






























                       

                      draft saved


                      draft discarded



















































                       


                      draft saved


                      draft discarded














                      StackExchange.ready(
                      function () {
                      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f207598%2fparse-a-get-or-post-string-in-c%23new-answer', 'question_page');
                      }
                      );

                      Post as a guest















                      Required, but never shown





















































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown

































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown







                      Popular posts from this blog

                      Сан-Квентин

                      Алькесар

                      Josef Freinademetz