I'm writing a utility function that returns the extension(s) of a boost::filesystem::path (v3). Boost's path class already provides part of this functionality:
using path = boost::filesystem::path;
path shader_file{"/var/.private/code/main.vertex.glsl"};
shader_file.extension(); // returns ".glsl"
Note that the . is included. However, path's extension() function returns only the last extension; there is no way to get .vertex.glsl from it.
I propose the utility function:
inline path extension( const path& p, int dots )
where dots indicates how many extensions should be returned:
extension(shader_file, 1); // returns ".glsl"
extension(shader_file, 2); // returns ".vertex.glsl"
extension(shader_file, 42); // returns ".vertex.glsl"
Note that dots may exceed the actual number of extensions, in which case all extensions are returned. Getting all extensions is a common use case, and setting dots arbitrarily high just to get them seems wrong. Therefore, I define that all extensions are returned when dots <= 0.
extension(shader_file, 0); // returns ".vertex.glsl"
extension(shader_file, -356); // returns ".vertex.glsl"
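To pin these rules down independently of Boost, here is a minimal sketch that works on a bare filename held in a std::string. The name extension_of is hypothetical, and it walks backwards with std::string::rfind, which is not the approach used in my proposal:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of Boost: returns up to `dots` trailing
// extensions of a bare filename; dots <= 0 means "all of them".
inline std::string extension_of(const std::string& filename, int dots = 0) {
    std::size_t pos = std::string::npos;   // position of the dot chosen so far
    std::size_t from = std::string::npos;  // where the next backward search starts
    for (int found = 0; dots <= 0 || found < dots; ++found) {
        const std::size_t dot = filename.rfind('.', from);
        if (dot == std::string::npos) break;  // no more dots to the left
        pos = dot;
        if (dot == 0) break;                  // the dot is the first character
        from = dot - 1;
    }
    return pos == std::string::npos ? std::string{} : filename.substr(pos);
}
```

With this sketch, extension_of("main.vertex.glsl", 1) gives ".glsl", while any of 2, 42, 0 or -356 gives ".vertex.glsl".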
Here is my proposed extension function:
////////////////////////////////////////////////////////////////////////////////
/// Extension
///
/// Negative or zero "dots" value returns all extensions. E.g.:
/// extension("/usr/lib.cpp/file.vertex.shader.glsl.cache", -2) -> ".vertex.shader.glsl.cache"
/// extension("/usr/lib.cpp/file.vertex.shader.glsl.cache", -1) -> ".vertex.shader.glsl.cache"
/// extension("/usr/lib.cpp/file.vertex.shader.glsl.cache", 0) -> ".vertex.shader.glsl.cache"
///
/// Positive "dots" value returns at most "dots" extensions (counting from the end of the path). E.g.:
/// extension("/usr/lib.cpp/file.vertex.shader.glsl.cache", 1) -> ".cache"
/// extension("/usr/lib.cpp/file.vertex.shader.glsl.cache", 2) -> ".glsl.cache"
/// extension("/usr/lib.cpp/file.vertex.shader.glsl.cache", 3) -> ".shader.glsl.cache"
/// extension("/usr/lib.cpp/file.vertex.shader.glsl.cache", 4) -> ".vertex.shader.glsl.cache"
/// extension("/usr/lib.cpp/file.vertex.shader.glsl.cache", 5) -> ".vertex.shader.glsl.cache"
/// extension("/usr/lib.cpp/file.vertex.shader.glsl.cache", 6) -> ".vertex.shader.glsl.cache"
///
/// Edge cases:
/// extension("var/", 0) -> ""
/// extension("var/", 1) -> ""
/// extension("var/", 2) -> ""
/// extension("var/file", 0) -> ""
/// extension("var/file", 1) -> ""
/// extension("var/file", 2) -> ""
/// extension("var/file.", 0) -> "."
/// extension("var/file.cpp.", 0) -> ".cpp."
/// extension("var/file.cpp...abc..", 0) -> ".cpp...abc.."
/// extension("var/file.cpp...abc..", 1) -> "."
/// extension("var/file.cpp...abc..", 2) -> ".."
/// extension("var/file.cpp...abc..", 3) -> ".abc.."
/// extension("var/file.cpp...abc..", 4) -> "..abc.."
///
////////////////////////////////////////////////////////////////////////////////
inline path extension( const path& path_, int dots = 0 ) {
// Get the filename to ensure that some edge cases are dealt with. E.g.:
// path{"/var/foo.bar/baz.txt"}.filename() -> path{"baz.txt"}
auto filename = path_.filename();
const auto& native = filename.native(); // Returns const path::string_type& (std::wstring on Windows, std::string on POSIX)
// Reverse search for the nth dot
auto rdot = algorithm::find_last_or_nth(native.crbegin(), native.crend(), '.', dots);
// No dot found: the filename has no extension (e.g. "var/file" -> "")
if (rdot == native.crend()) return {};
// base() points one past the dot, so step back to include the dot itself
auto nth_dot = rdot.base();
--nth_dot;
return {nth_dot, native.cend()};
}
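The reverse_iterator to iterator conversion used above can be looked at in isolation. The following is a minimal sketch on std::string with a hypothetical helper, from_last_dot; it assumes nothing about Boost. The point is that base() refers to the element one past the match in forward order, and that a failed search yields crend(), whose base() is cbegin(), so "not found" has to be told apart from "dot is the first character":

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: returns the suffix of `s` starting at its last dot,
// or an empty string if `s` contains no dot.
inline std::string from_last_dot(const std::string& s) {
    auto rit = std::find(s.crbegin(), s.crend(), '.');
    if (rit == s.crend()) return {};  // no dot: base() would equal cbegin()
    auto it = rit.base();             // one past the dot in forward order
    --it;                             // step back onto the dot itself
    return std::string(it, s.cend());
}
```

Here from_last_dot("baz.txt") yields ".txt", from_last_dot(".bashrc") yields ".bashrc", and from_last_dot("baz") yields an empty string.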
I've added additional examples in the comment block. I've cheated a bit and used path::filename() to make life easier. I know that this involves an additional copy. Note the function find_last_or_nth. This is an algorithm I've introduced to solve the problem. It is defined as follows:
////////////////////////////////////////////////////////////////////////////////
/// Find Last or nth
///
/// Let m be the number of occurrences of value in the range [first, last).
///
/// 1) If m > 0 and n > 0: Returns an iterator to the min(m, n)'th occurrence of
/// value in the range.
/// 2) If m > 0 and n <= 0: Returns an iterator to the m'th occurrence of value
/// in the range.
/// 3) If m = 0 : Returns last.
///
////////////////////////////////////////////////////////////////////////////////
// counter needs a default type: it cannot be deduced from the default argument n = 0
template<typename input_iterator, typename T, typename counter = int>
input_iterator find_last_or_nth( input_iterator first, input_iterator last, const T& value, counter n = 0 ) {
// Default to the m = 0 case
auto result = last;
// Loops until end of the range...
// ...or n times if n > 0.
if (0 < n) ++n;
while (first != last && --n) {
// Advance to the next occurrence of value
first = std::find(first, last, value);
// Not yet end of range...
if (first != last) {
// ...so store last occurrence...
result = first;
// ...and advance to the remaining range.
++first;
}
}
return result;
}
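For comparison, the same contract could be written with an explicit match counter instead of the ++n / --n trick. This is only a sketch under assumptions: the name find_last_or_nth_counted is hypothetical, the counter is fixed to int, and forward iterators are assumed because result is kept while first keeps advancing:

```cpp
#include <algorithm>
#include <string>

// Sketch of the same contract with an explicit match counter.
// Assumes forward iterators, since `result` must stay valid while `first` advances.
template <typename ForwardIt, typename T>
ForwardIt find_last_or_nth_counted(ForwardIt first, ForwardIt last, const T& value, int n = 0) {
    ForwardIt result = last;        // covers the m = 0 case
    int matches = 0;
    while ((first = std::find(first, last, value)) != last) {
        result = first;             // latest occurrence seen so far
        if (++matches == n) break;  // never true when n <= 0
        ++first;                    // continue in the remaining range
    }
    return result;
}
```

On the string "a.b.c.d", n = 1 gives the dot at index 1, n = 2 the dot at index 3, and n = 10, 0 or -3 all give the last dot at index 5; a string without a dot gives the end iterator.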
Have I missed some edge cases? What do you think of the style (I've tried to mimic STL and Boost)? Any alternative implementations?